(54) Heregulins (HRGs), binding proteins of P185erb2



(57) A novel polypeptide with binding affinity for the
p185HER2 receptor, designated heregulin-α, has been
identified and purified from cultured human cells. DNA
sequences encoding additional heregulin polypeptides,
designated heregulin-α, heregulin-β1, heregulin-β2,
heregulin-β2-like, and heregulin-β3, have been isolated,
sequenced and expressed. Provided herein are nucleic
acid sequences encoding the amino acid sequences of
heregulins useful in the production of heregulins by
recombinant means. Further provided are the amino acid
sequences of heregulins and purification methods
therefor. Heregulins and their antibodies are useful as
therapeutic agents and in diagnostic methods.
Description

BACKGROUND OF THE INVENTION 
Field of the Invention

[0001] This invention relates to polypeptide ligands that bind to receptors implicated in cellular growth. In particular,
it relates to polypeptide ligands that bind to the p185HER2 receptor.

Description of Background and Related Art

[0002] Cellular protooncogenes encode proteins that are thought to regulate normal cellular proliferation and
differentiation. Alterations in their structure or amplification of their expression lead to abnormal cellular growth and have
been associated with carcinogenesis (Bishop JM, Science 235:305-311 [1987]); (Rhims JS, Cancer Detection and
Prevention 11:139-149 [1988]); (Nowell PC, Cancer Res. 46:2203-2207 [1986]); (Nicolson GL, Cancer Res. 47:
1473-1487 [1987]). Protooncogenes were first identified by either of two approaches. First, molecular characterization
of the genomes of transforming retroviruses showed that the genes responsible for the transforming ability of the virus
in many cases were altered versions of genes found in the genomes of normal cells. The normal version is the
protooncogene, which is altered by mutation to give rise to the oncogene. An example of such a gene pair is represented
by the EGF receptor and the v-erb-B gene product. The virally encoded v-erb-B gene product has suffered truncation
and other alterations that render it constitutively active and endow it with the ability to induce cellular transformation
(Yarden et al., Ann. Rev. Biochem. 57:443-478, 1988).

[0003] The second method for detecting cellular transforming genes that behave in a dominant fashion involves
transfection of cellular DNA from tumor cells of various species into nontransformed target cells of a heterologous
species. Most often this was done by transfection of human, avian, or rat DNAs into the murine NIH 3T3 cell line (Bishop
JM, Science 235:305-311 [1987]); (Rhims JS, Cancer Detection and Prevention 11:139-149 [1988]); (Nowell PC,
Cancer Res. 46:2203-2207 [1986]); (Nicolson GL, Cancer Res. 47:1473-1487 [1987]); (Yarden et al., Ann. Rev. Biochem.
57:443-478 [1988]). Following several cycles of genomic DNA isolation and retransfection, the human or other species
DNA was molecularly cloned from the murine background and subsequently characterized. In some cases, the same
genes were isolated following transfection and cloning as those identified by the direct characterization of transforming
viruses. In other cases, novel oncogenes were identified. An example of a novel oncogene identified by this transfection
assay is the neu oncogene. It was discovered by Weinberg and colleagues in a transfection experiment in which the
initial DNA was derived from a carcinogen-induced rat neuroblastoma (Padhy et al., Cell 28:865-871 [1982]);
(Schechter et al., Nature 312:513-516 [1984]). Characterization of the rat neu oncogene revealed that it had the structure of a
growth factor receptor tyrosine kinase, had homology to the EGF receptor, and differed from its normal counterpart,
the neu protooncogene, by an activating mutation in its transmembrane domain (Bargmann et al., Cell 45:649-657
[1986]). The human counterpart to neu is the HER2 protooncogene, also designated c-erb-B2 (Coussens et al., Science
230:1137-1139 [1985]; WO89/05692).

[0004] The association of the HER2 protooncogene with cancer was established by yet a third approach, that is, its
association with human breast cancer. The HER2 protooncogene was first discovered in cDNA libraries by virtue of
its homology with the EGF receptor, with which it shares structural similarities throughout (Yarden et al., Ann. Rev.
Biochem. 57:443-478 [1988]). When radioactive probes derived from the cDNA sequence encoding p185HER2 were
used to screen DNA samples from breast cancer patients, amplification of the HER2 protooncogene was observed in
about 30% of the patient samples (Slamon et al., Science 235:177-182 [1987]). Further studies have confirmed this
original observation and extended it to suggest an important correlation between HER2 protooncogene amplification
and/or overexpression and worsened prognosis in ovarian cancer and non-small cell lung cancer (Slamon et al.,
Science 244:707-712 [1989]); (Wright et al., Cancer Res. 49:2087-2090, 1989); (Paik et al., J. Clin. Oncology 8:103-112
[1990]); (Berchuck et al., Cancer Res. 50:4087-4091, 1990); (Kern et al., Cancer Res. 50:5184-5191, 1990).
[0005] The association of HER2 amplification/overexpression with aggressive malignancy, as described above,
implies that it may have an important role in progression of human cancer; however, many tumor-related cell surface
antigens have been described in the past, few of which appear to have a direct role in the genesis or progression of
disease (Schlom et al., Cancer Res. 50:820-827, 1990); (Szala et al., Proc. Natl. Acad. Sci. 98:3542-3546).
[0006] Among the protooncogenes are those that encode cellular growth factors which act through endoplasmic
kinase phosphorylation of cytoplasmic protein. The HER1 gene (or erb-B1) encodes the epidermal growth factor (EGF)
receptor. The β-chain of platelet-derived growth factor is encoded by the c-sis gene. The granulocyte-macrophage
colony stimulating factor is encoded by the c-fms gene. The neu protooncogene has been identified in ethylnitrosourea-
induced rat neuroblastomas. The HER2 gene encodes the 1,255 amino acid tyrosine kinase receptor-like glycoprotein
p185HER2 that has homology to the human epidermal growth factor receptor.
[0007] The known receptor tyrosine kinases all have the same general structural motif: an extracellular domain that
binds ligand, and an intracellular tyrosine kinase domain that is necessary for signal transduction and transformation.
These two domains are connected by a single stretch of approximately 20 mostly hydrophobic amino acids, called the
transmembrane spanning sequence. This transmembrane spanning sequence is thought to play a role in transferring
the signal generated by ligand binding from the outside of the cell to the inside. Consistent with this general structure,
the human p185HER2 glycoprotein, which is located on the cell surface, may be divided into three principal portions:
an extracellular domain, or ECD (also known as XCD); a transmembrane spanning sequence; and a cytoplasmic,
intracellular tyrosine kinase domain. While it is presumed that the extracellular domain is a ligand receptor, the p185HER2
ligand has not yet been positively identified.

[0008] No specific ligand binding to p185HER2 has been identified, although Lupu et al. (Science 249:
1552-1555, 1990) describe an inhibitory 30 kDa glycoprotein secreted from human breast cancer cells which is alleged
to be a putative ligand for p185HER2. Lupu et al. (Science 249:1552-1555 (1990); Proceedings of the American Assoc.
for Cancer Research, Vol 32, Abs 297, March 1991) reported the purification of a 30 kD factor from MDA-MB-231 cells
and a 75 kD factor from SK-BR-3 cells that stimulates p185HER2. The 75 kD factor reportedly induced phosphorylation
of p185HER2 and modulated cell proliferation and colony formation of SK-BR-3 cells overexpressing the p185HER2
receptor. The 30 kD factor competes with muMAb 4D5 for binding to p185HER2; its growth effect on SK-BR-3 cells was
dependent on 30 kD concentration (stimulatory at low concentrations and inhibitory at higher concentrations).
Furthermore, it stimulated the growth of MDA-MB-468 cells (EGF-R positive, p185HER2 negative), it stimulated phosphorylation
of the EGF receptor and it could be obtained from SK-BR-3 cells. In the rat neu system, Yarden et al. (Biochemistry
30:3543-3550, 1991) describe a 35 kDa glycoprotein candidate ligand for the neu encoded receptor secreted by ras
transformed fibroblasts. Dobashi et al. (Proc. Natl. Acad. Sci. USA 88:8582-8586 (1991); Biochem. Biophys. Res.
Commun. 179:1536-1542 (1991)) described a neu protein-specific activating factor (NAF) which is secreted by human
T-cell line ATL-2 and which has a molecular weight in the range of 8-24 kD. A 25 kD ligand from activated macrophages
was also described (Tarakhovsky et al., J. Cancer Res., 2188-2196 (1991)).

[0009] Methods for the in vivo assay of tumors using HER2 specific monoclonal antibodies and methods of treating
tumor cells using HER2 specific monoclonal antibodies are described in WO89/06692.

[0010] There is a current and continuing need in the art to identify the actual ligand or ligands that activate p185HER2,
and to identify their biological role(s), including their roles in cell-growth and differentiation, cell-transformation and the
creation of malignant neoplasms.

[0011] Accordingly, it is an object of this invention to identify and purify one or more novel p185HER2 ligand
polypeptide(s) that bind and stimulate p185HER2.

[0012] It is another object to provide nucleic acid encoding novel p185HER2 binding ligand polypeptides and to use
this nucleic acid to produce a p185HER2 binding ligand polypeptide in recombinant cell culture for therapeutic or
diagnostic use, and for the production of therapeutic antagonists for use in certain metabolic disorders including, but not
necessarily restricted to the killing, inhibition and/or diagnostic imaging of tumors and tumorigenic cells.

[0013] It is a further object to provide derivatives and modified forms of novel glycoprotein ligands, including amino
acid sequence variants, fusion polypeptides combining a p185HER2 binding ligand and a heterologous protein and
covalent derivatives of a p185HER2 binding ligand.

[0014] It is an additional object to prepare immunogens for raising antibodies against p185HER2 binding ligands, as
well as to obtain antibodies capable of binding to such ligands, and antibodies which bind a p185HER2 binding ligand
and prevent the ligand from activating p185HER2. It is a further object to prepare immunogens comprising a p185HER2
binding ligand fused with an immunogenic heterologous polypeptide.

[0015] These and other objects of the invention will be apparent to the ordinary artisan upon consideration of the 
specification as a whole. 


SUMMARY OF THE INVENTION 

[0016] In accordance with the objects of this invention, we have identified and isolated novel ligand families which
bind to p185HER2. These ligands are denominated the heregulin (HRG) polypeptides, and include HRG-α, HRG-β1,
HRG-β2, HRG-β3 and other HRG polypeptides which cross-react with antibodies directed against these family
members and/or which are substantially homologous as defined infra. A preferred HRG is the ligand disclosed in Fig. 4 and
its fragments, further designated HRG-α. Other preferred HRGs are the ligands and their fragments disclosed in Figure
8, and designated HRG-β1, HRG-β2 disclosed in Figure 12, and HRG-β3 disclosed in Figure 13.

[0017] In another aspect, the invention provides a composition comprising HRG which is isolated from its source
environment, in particular HRG that is free of contaminating human polypeptides. HRG is purified by absorption to
heparin sepharose, cation (e.g. polyaspartic acid) exchange resins, and reversed phase HPLC.

[0018] HRG or HRG fragments (which also may be synthesized by in vitro methods) are fused (by recombinant
expression or an in vitro peptidyl bond) to an immunogenic polypeptide and this fusion polypeptide, in turn, is used to
raise antibodies against an HRG epitope. Anti-HRG antibodies are recovered from the serum of immunized animals.
Alternatively, monoclonal antibodies are prepared from cells in vitro or from in vivo immunized animals in conventional
fashion. Preferred antibodies identified by routine screening will bind to HRG, but will not substantially cross-react with
any other known ligands such as EGF, and will prevent HRG from activating p185HER2. In addition, anti-HRG antibodies
are selected that are capable of binding specifically to individual family members of the HRG family, e.g. HRG-α,
HRG-β1, HRG-β2, HRG-β3, and thereby may act as specific antagonists thereof.

[0019] HRG also is derivatized in vitro to prepare immobilized HRG and labeled HRG, particularly for purposes of
diagnosis of HRG or its antibodies, or for affinity purification of HRG antibodies. Immobilized anti-HRG antibodies are
useful in the diagnosis (in vitro or in vivo) or purification of HRG. In one preferred embodiment, a mixture of HRG and
other peptides is passed over a column to which the anti-HRG antibodies are bound.

[0020] Substitutional, deletional, or insertional variants of HRG are prepared by in vitro or recombinant methods and 
screened, for example, for immuno-crossreactivity with the native forms of HRG and for HRG antagonist or agonist 
activity. 

[0021] In another preferred embodiment, HRG is used for stimulating the activity of p185HER2 in normal cells. In
another preferred embodiment, a variant of HRG is used as an antagonist to inhibit stimulation of p185HER2.

[0022] HRG, its derivatives, or its antibodies are formulated into physiologically acceptable vehicles, especially for
therapeutic use. Such vehicles include sustained-release formulations of HRG or HRG variants. A composition is also
provided comprising HRG and a pharmaceutically acceptable carrier, and an isolated polypeptide comprising HRG
fused to a heterologous polypeptide.

[0023] In still other aspects, the invention provides an isolated nucleic acid encoding an HRG, which nucleic acid
may be labeled or unlabeled with a detectable moiety, and a nucleic acid sequence that is complementary, or hybridizes
under stringent conditions to, a nucleic acid sequence encoding an HRG.

[0024] The nucleic acid sequence is also useful in hybridization assays for HRG nucleic acid and in a method of
determining the presence of an HRG, comprising hybridizing the DNA (or RNA) encoding (or complementary to) an
HRG to a test sample nucleic acid and determining the presence of an HRG. The invention also provides a method of
amplifying a nucleic acid test sample comprising priming a nucleic acid polymerase (chain) reaction with nucleic acid
(DNA or RNA) encoding (or complementary to) an HRG.

[0025] In still further aspects, the nucleic acid is DNA and further comprises a replicable vector comprising the nucleic
acid encoding an HRG operably linked to control sequences recognized by a host transformed by the vector; host cells
transformed with the vector; and a method of using a nucleic acid encoding an HRG to effect the production of HRG,
comprising expressing HRG nucleic acid in a culture of the transformed host cells and recovering an HRG from the
host cell culture.

[0026] In further embodiments, the invention provides a method for producing HRG comprising inserting into the
DNA of a cell containing the nucleic acid encoding an HRG a transcription modulatory element in sufficient proximity
and orientation to an HRG nucleic acid to influence (suppress or stimulate) transcription thereof, with an optional further
step comprising culturing the cell containing the transcription modulatory element and an HRG nucleic acid.

[0027] In still further embodiments, the invention provides a cell comprising the nucleic acid encoding an HRG and
an exogenous transcription modulatory element in sufficient proximity and orientation to an HRG nucleic acid to
influence transcription thereof; and a host cell containing the nucleic acid encoding an HRG operably linked to exogenous
control sequences recognized by the host cell.

BRIEF DESCRIPTION OF THE DRAWINGS 
[0028] 


Figure 1 Purification of Heregulin on PolyAspartic Acid column. 

[0029] Polyaspartic acid column chromatography of heregulin-α was conducted and the elution profile of proteins
measured at A214. The 0.6 M NaCl pool from the heparin Sepharose purification step was diluted to 0.2 M NaCl with
water and loaded onto the polyaspartic acid column equilibrated in 17 mM Na phosphate, pH 6.8 with 30% ethanol. A
linear NaCl gradient from 0.3 to 0.6 M was initiated at 0 time and was complete at 30 minutes. Fractions were tested
in the HRG tyrosine autophosphorylation assay. The fractions corresponding to peak C were pooled for further purification
on C4 reversed phase HPLC.
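
The dilution and gradient figures in this legend can be checked with simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes plain volumetric dilution with water and a strictly linear gradient, which the legend implies but does not state in detail.

```python
def water_volumes_to_add(start_molar, target_molar):
    """Volumes of water to add per volume of pool to reach target_molar.

    Assumes ideal volumetric dilution (an assumption, not from the legend).
    """
    if not 0 < target_molar <= start_molar:
        raise ValueError("target must be positive and not exceed the start")
    return start_molar / target_molar - 1.0


def nacl_molar_at(minutes, start_molar=0.3, end_molar=0.6, duration_min=30.0):
    """NaCl molarity at a given time in a linear gradient.

    Times before 0 or after the end of the gradient are clamped
    to the start and end concentrations respectively.
    """
    fraction = min(max(minutes / duration_min, 0.0), 1.0)
    return start_molar + (end_molar - start_molar) * fraction


if __name__ == "__main__":
    # Dilution of the 0.6 M heparin Sepharose pool to 0.2 M for loading.
    print(round(water_volumes_to_add(0.6, 0.2), 3))
    # Gradient concentration at the 15-minute midpoint.
    print(round(nacl_molar_at(15), 3))
```

Under these assumptions the 0.6 M pool needs about two volumes of water per volume of pool, and the gradient is at about 0.45 M NaCl at its 15-minute midpoint.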

Figure 2 C4 Reversed Phase Purification of Heregulin-α.

Panel A: Pool C from the polyaspartic acid column was applied to a C4 HPLC column (SynChropak
RP-4) equilibrated in 0.1% TFA and the proteins eluted with a linear acetonitrile gradient at 0.25%/
minute. The absorbance trace for the run numbered C4-17 is shown. One milliliter fractions were
collected for assay.

Panel B: Ten microliter aliquots of the fractions were tested in the HRG tyrosine autophosphorylation
assay. Levels of phosphotyrosine in the p185HER2 protein were quantitated by a specific
antiphosphotyrosine antibody and displayed in arbitrary units on the abscissa.

Panel C: Ten microliter fractions were taken and subjected to SDS gel electrophoresis on 4-20%
acrylamide gradient gels according to the procedure of Laemmli (Nature, 227:680-685, 1970). The
molecular weights of the standard proteins are indicated to the left of the lane containing the
standards. The major peak of tyrosine phosphorylation activity found in fraction 17 was associated with a
prominent 45,000 Da band (HRG-α).

Figure 3. SDS Polyacrylamide Gel Showing Purification of Heregulin-α.

[0030] Molecular weight markers are shown in Lane 1. Aliquots from the MDA-MB-231 conditioned media (Lane 2),
the 0.6 M NaCl pool from the heparin Sepharose column (Lane 3), Pool C from the polyaspartic acid column (Lane 4)
and Fraction 17 from the HPLC column (C4-17) (Lane 5) were electrophoresed on a 4-20% gradient gel and silver
stained. Lanes 6 and 7 contained buffer only and show the presence of gel artifacts in the 50-65 kDa molecular weight
region.

Figures 4a-4d depict the deduced amino acid sequence of the cDNA contained in λgt10her16 (SEQ ID NO:12 and
SEQ ID NO:13). The nucleotides are numbered at the top left of each line and the amino acids written in three
letter code are numbered at the bottom left of each line. The nucleotide sequence corresponding to the probe is
nucleotides 681-720. The probable transmembrane domain is amino acids 287-309. The six cysteines of the EGF
motif are 226, 234, 240, 254, 256 and 265. The five potential three-amino acid N-linked glycosylation sites are
164-166, 170-172, 208-210, 437-439 and 609-611. The serine-threonine potential O-glycosylation sites are
209-221. Serine-glycine dipeptide potential glycosaminoglycan addition sites are amino acids 42-43, 64-65 and
151-152. The initiating methionine (MET) is at position #45 of Figure 4 although the processed N-terminal residue
is S46.

Figure 5 Northern blot analysis of MDA-MB-231 and SKBR3 RNAs. Labeled from left to right are the following: 1)
MDA-MB-231 polyA minus-RNA (RNA remaining after polyA-containing RNA is removed); 2) MDA-MB-231 polyA
plus-mRNA (RNA which contains polyA); 3) SKBR3 polyA minus-RNA; and 4) SKBR3 polyA plus-mRNA. The
probe used for this analysis was a radioactively (32P) labelled internal xho1 DNA restriction endonuclease fragment
from the cDNA portion of λgt10her16.

Figure 6 Sequence Comparisons in the EGF Family of Proteins. 

Sequences of several EGF-like proteins (SEQ ID NOS: 14, 15, 16, 17, 18, and 19) around the cysteine domain
are aligned with the sequence of HRG-α. The location in Figure 6 of the cysteines and the invariant glycine and
arginine residues at positions 238 and 264 clearly show that HRG-α is a member of the EGF family. The region
in Figure 6 of highest amino acid identity of the family members relative to HRG-α (30-40%) is found between Cys
234 and Cys 265. The strongest identity (40%) is with the heparin-binding EGF (HB-EGF) species. HRG-α has a
unique 3 amino acid insert between Cys 240 and Cys 254. Potential transmembrane domains are boxed (287-309).
Bars indicate the carboxy-terminal sites for EGF and TGF-alpha where proteolytic cleavage detaches the mature
growth factors from their transmembrane associated proforms. HB-EGF is heparin binding-epidermal growth factor;
EGF is epidermal growth factor; TGF-alpha is transforming growth factor alpha; and schwannoma is the
schwannoma-derived growth factor. The residue numbers in Fig. 6 reflect the Fig. 4 convention.

Figure 7 Stimulation of Cell Growth by HRG-α.

Three different cell lines were tested for growth responses to 1 nM HRG-α. Cell protein was quantitated by
crystal violet staining and the responses normalized to control, untreated cells.

Figures 8a-8d (SEQ ID NO:7) depict the entire potential coding DNA nucleotide sequence of the heregulin-β1
and the deduced amino acid sequence of the cDNA contained in λher11.1dbl (SEQ ID NO:9). The nucleotides
are numbered at the top left of each line and the amino acids written in three letter code are numbered at the
bottom left of each line. The probable transmembrane amino acid domain is amino acids 278-300. The six cysteines
of the EGF motif are 212, 220, 226, 240, 242 and 251. The five potential three-amino acid N-linked glycosylation
sites are 150-152, 156-158, 196-198, 428-430 and 600-612. The serine-threonine potential O-glycosylation sites
are 195-207. Serine-glycine dipeptide potential glycosaminoglycan addition sites are amino acids 28-29, 50-51
and 137-138. The initiating methionine (MET) is at position #31. HRG-β1 is processed to the N-terminal residue
S32.

Figure 9 depicts a comparison of the amino acid sequences of heregulin-α and -β1. A dash (-) indicates no amino
acid at that position. (SEQ ID NO:8 and SEQ ID NO:9). This figure uses the numbering convention of Figs. 4 and 6.
Figure 10 shows the stimulation of HER2 autophosphorylation using recombinant HRG-α as measured by HER2
tyrosine phosphorylation.

Figure 11 depicts the nucleotide and imputed amino acid sequence of λ15'her13 (SEQ ID NO:22); the amino acid
residue numbering convention is unique to this figure.

Figures 12a-12e depict the nucleotide sequence of λher76, encoding heregulin-β2 (SEQ ID NO:23). This figure
commences amino acid residue numbering with the expressed N-terminal MET; the N-terminus is S2.

Figures 13a-13c depict the nucleotide sequence of λher78, encoding heregulin-β3 (SEQ ID NO:24). This figure
uses the amino acid numbering convention of Fig. 12; S2 is the processed N-terminus.

Figures 14a-14d depict the nucleotide sequence of λher84, encoding a heregulin-β2-like polypeptide (SEQ ID
NO:25). This figure uses the amino acid numbering convention of Fig. 12; S2 is the processed N-terminus.

Figures 15a-15c depict the amino acid homologies between the known heregulins (α, β1, β2, β2-like and β3 in
descending order) and illustrate the amino acid insertions, deletions or substitutions that distinguish the different
forms (SEQ ID NOS:26-30). This figure uses the amino acid numbering convention of Figs. 12-14.
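
The legends above use more than one residue numbering convention: Figs. 4 and 8 number from upstream of the initiating methionine (MET at position 45 and 31, respectively), whereas Figs. 12-15 start at the expressed MET. The following Python sketch shows one way to move between them. It is a convenience illustration, not part of the disclosure; the function and dictionary names are hypothetical, and it assumes the conversion is a constant offset within a single sequence, with no alignment gaps involved.

```python
# Position of the initiating methionine in each figure's own numbering,
# taken from the legends for Figs. 4 and 8.
INITIATING_MET = {"fig4": 45, "fig8": 31}


def to_met1_numbering(position, figure):
    """Convert a Fig. 4 or Fig. 8 residue number to numbering with MET = 1.

    Assumes a simple constant offset within one sequence (no gaps).
    """
    met = INITIATING_MET[figure]
    if position < met:
        raise ValueError("residue lies upstream of the initiating methionine")
    return position - met + 1


if __name__ == "__main__":
    # Processed N-terminal serines: S46 (Fig. 4) and S32 (Fig. 8).
    print(to_met1_numbering(46, "fig4"), to_met1_numbering(32, "fig8"))
    # First EGF-motif cysteines: 226 (Fig. 4) and 212 (Fig. 8).
    print(to_met1_numbering(226, "fig4"), to_met1_numbering(212, "fig8"))
```

With this offset, S46 of Fig. 4 and S32 of Fig. 8 both become S2, matching the processed N-terminus stated for Figs. 12-14, and the first EGF-motif cysteine of either sequence becomes residue 182.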

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

1. Definitions 

[0031] In general, the following words or phrases have the indicated definition when used in the description,
examples, and claims.

[0032] Heregulin ("HRG") is defined herein to be any isolated polypeptide sequence which possesses a biological
activity of a polypeptide disclosed in Figs. 4, 8, 12, 13, or 15, and fragments, alleles or animal analogues thereof.
HRG excludes any polypeptide heretofore identified, including any known polypeptide which
is otherwise anticipatory under 35 U.S.C. 102, as well as polypeptides obvious over such known polypeptides under
35 U.S.C. 103, including in particular EGF, TGF-α, amphiregulin (Plowman et al., Mol. Cell. Biol. 10:1969 (1990)),
HB-EGF (Higashiyama et al., Science 251:936 [1991]), schwannoma factor or polypeptides obvious thereover.

[0033] "Biological activity" for the purposes herein means an in vivo effector or antigenic function that is directly or
indirectly performed by an HRG polypeptide (whether in its native or denatured conformation), or by any subsequence
thereof. Effector functions include receptor binding or activation, induction of differentiation, mitogenic or growth
promoting activity, immune modulation, DNA regulatory functions and the like, whether presently known or inherent.
Antigenic functions include possession of an epitope or antigenic site that is capable of cross-reacting with antibodies
raised against a naturally occurring or denatured HRG polypeptide or fragment thereof.

[0034] Biologically active HRG includes polypeptides having both an effector and antigenic function, or only one of
such functions. HRG includes antagonist polypeptides to HRG, provided that such antagonists include an epitope of
a native HRG. A principal known effector function of HRG is its ability to bind to p185HER2 and activate the receptor
tyrosine kinase.

[0035] HRG includes the translated amino acid sequence of full length human HRGs (proHRG) set forth herein in
the Figures; deglycosylated or unglycosylated derivatives; amino acid sequence variants; and covalent derivatives of
HRG, provided that they possess biological activity. While the native proform of HRG is probably a membrane-bound
polypeptide, soluble forms, such as those forms lacking a functional transmembrane domain (proHRG or its fragments),
are also included within this definition.

[0036] Fragments of intact HRG are included within the definition of HRG. Two principal domains are included within
the fragments. The first is the growth factor domain ("GFD"), homologous to the EGF family and located at about
residues S216-A227 to N268-R286 (Fig. 9, HRG-α; the GFD domains for other HRGs (Fig. 15) are the homologous
sequences). Preferably, the GFDs for HRG-α, β1, β2, β2-like and β3 are, respectively, G175-K241, G175-K246,
G175-K238, G175-K238 and G175-E241 (Fig. 15).
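
As an aid to reading the preferred GFD boundaries just listed, the Python sketch below tabulates them and slices the corresponding fragment from a one-letter sequence. The names `GFD_BOUNDS`, `gfd_length` and `extract_gfd` are hypothetical, and the sketch assumes the sequence passed in is numbered exactly as in Fig. 15 (1-based, inclusive); no heregulin sequence is reproduced here.

```python
# Preferred GFD boundaries as stated in the text (Fig. 15 numbering):
# (first residue, start position, last residue, end position).
GFD_BOUNDS = {
    "HRG-alpha": ("G", 175, "K", 241),
    "HRG-beta1": ("G", 175, "K", 246),
    "HRG-beta2": ("G", 175, "K", 238),
    "HRG-beta2-like": ("G", 175, "K", 238),
    "HRG-beta3": ("G", 175, "E", 241),
}


def gfd_length(name):
    """Number of residues spanned by the stated GFD boundaries."""
    _, start, _, end = GFD_BOUNDS[name]
    return end - start + 1


def extract_gfd(sequence, name):
    """Slice the GFD from a one-letter sequence numbered as in Fig. 15.

    Raises ValueError if the boundary residues do not match the text,
    which usually signals a numbering mismatch.
    """
    first, start, last, end = GFD_BOUNDS[name]
    if len(sequence) < end:
        raise ValueError("sequence is shorter than the GFD end position")
    fragment = sequence[start - 1:end]
    if fragment[0] != first or fragment[-1] != last:
        raise ValueError("boundary residues do not match the stated GFD")
    return fragment


if __name__ == "__main__":
    for name in GFD_BOUNDS:
        print(name, gfd_length(name))
```

The boundary check is a deliberate design choice: because the figures use several numbering conventions, a mismatch at either end is more likely a numbering error than a genuine variant.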

[0037] Another fragment of interest is the N-terminal domain ("NTD"). The NTD extends from the N-terminus of
processed HRG (S2) to the residue adjacent to an N-terminal residue of the GFD, i.e., about T172-C182 (Fig. 15) and
preferably T174. An additional group of fragments are NTD-GFD domains, equivalent to the extracellular domains of
HRG-α and β1-β2. Another fragment is the C-terminal peptide ("CTP") located about 20 residues N-terminal to the first
residue of the transmembrane domain, either alone or in combination with the C-terminal remainder of the HRG.

[0038] In preferred embodiments, antigenically active HRG is a polypeptide that binds with an affinity of at least about
10^ l/mole to an antibody raised against a naturally occurring HRG sequence. Ordinarily the polypeptide binds with an
affinity of at least about 10^ l/mole. Most preferably, the antigenically active HRG is a polypeptide that binds to an
antibody raised against one of HRGs in its native conformation. HRG in its native conformation generally is HRG as
found in nature which has not been denatured by chaotropic agents, heat or other treatment that substantially modifies
the three dimensional structure of HRG as determined, for example, by migration on nonreducing, nondenaturing sizing
gels. Antibody used in this determination is rabbit polyclonal antibody raised by formulating native HRG from a
non-rabbit species in Freund's complete adjuvant, subcutaneously injecting the formulation into rabbits, and boosting the
immune response by intraperitoneal injection of the formulation until the titer of anti-HRG antibody plateaus.
[0039] Ordinarily, biologically active HRG will have an amino acid sequence having at least 75% amino acid sequence
identity with an HRG sequence, more preferably at least 80%, even more preferably at least 90%, and most preferably
at least 95%. Identity or homology with respect to an HRG sequence is defined herein as the percentage of amino acid
residues in the candidate sequence that are identical with HRG residues in Fig. 15, after aligning the sequences and
introducing gaps, if necessary, to achieve the maximum percent homology, and not considering any conservative
substitutions to be identical residues. None of N-terminal, C-terminal or internal extensions, deletions, or insertions into
HRG sequence shall be construed as affecting homology.
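
The identity measure defined above can be illustrated with a short Python sketch. It is one possible reading of the definition, not a prescribed procedure: the function name `percent_identity` is hypothetical, the two inputs are assumed to be already aligned (the alignment step itself is not shown), and gapped positions are excluded from the count on the assumption that this is what is meant by extensions, deletions and insertions not affecting homology.

```python
def percent_identity(candidate_aligned, reference_aligned, gap="-"):
    """Percent of aligned candidate residues identical to the reference.

    Both inputs are pre-aligned one-letter sequences of equal length,
    with `gap` marking introduced gaps.
    """
    if len(candidate_aligned) != len(reference_aligned):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for cand, ref in zip(candidate_aligned, reference_aligned):
        if cand == gap or ref == gap:
            # Extensions, deletions and insertions are not scored.
            continue
        compared += 1
        if cand == ref:
            # Exact matches only; conservative substitutions do not count.
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned strings, not real heregulin sequences.
    print(percent_identity("ACDE", "ACDF"))
    print(percent_identity("AC-DE", "ACWDE"))
```

On the toy inputs, one mismatch in four compared positions gives 75% identity, while a single gap with all other positions matching gives 100%, since the gapped column is skipped.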

[0040] Thus, the biologically active HRG polypeptides that are the subject of this invention include each expressed
or processed HRG sequence; fragments thereof having a consecutive sequence of at least 5, 10, 15, 20, 25, 30 or 40
amino acid residues; amino acid sequence variants of HRG wherein an amino acid residue has been inserted N- or
C-terminal to, or within, HRG sequence or its fragment as defined above; amino acid sequence variants of HRG
sequence or its fragment as defined above wherein a residue has been substituted by another residue. HRG polypeptides
include those containing predetermined mutations by, e.g., site-directed or PCR mutagenesis. HRG includes HRG
from such species as rabbit, rat, porcine, non-human primate, equine, murine, and ovine HRG and alleles or other
naturally occurring variants of the foregoing; derivatives of HRG or its fragments as defined above wherein HRG or its
fragments have been covalently modified by substitution, chemical, enzymatic, or other appropriate means with a
moiety other than a naturally occurring amino acid (for example a detectable moiety such as an enzyme or radioisotope);
glycosylation variants of HRG (insertion of a glycosylation site or deletion of any glycosylation site by deletion, insertion
or substitution of an appropriate residue); and soluble forms of HRG, such as HRG-GFD or those that lack a functional
transmembrane domain.

[0041] Of particular interest are fusion proteins that contain HRG-NTD but are free of the GFD ordinarily associated
with the HRG-NTD in question. The first 23 amino acids of the NTD are dominated by charged residues and contain
a sequence (GKKKER; residues 13-18, Fig. 15) that closely resembles the consensus sequence motif for nuclear
targeting (Roberts, Biochim. Biophys. Acta 1008:263 [1989]). Accordingly, the HRG includes fusions in which the NTD,
or at least a polypeptide comprising its first about 23 residues, is fused at a terminus to a non-HRG polypeptide or to
a GFD of another HRG family member. The non-HRG polypeptide in this embodiment is a regulatory protein, a growth
factor such as EGF or TGF-α, or a polypeptide ligand that binds to a cell receptor, particularly a cell surface receptor
found on the surface of a cell whose regulation is desired, e.g. a cancer cell.

[0042] In another embodiment, one or more of residues 13-18 independently are varied to produce a sequence
incapable of nuclear targeting. For example, G13 is mutated to any other naturally occurring residue including P, L, I,
V, A, M, F, K, D or S; any one or more of K14-K16 are mutated to any other naturally occurring residue including R, H,
D, E, N or Q; E17 to any other naturally occurring residue including D, R, K, H, N or Q; and R18 to any other naturally
occurring residue including K, H, D, E, N or Q. All or any one of residues 13-18 are deleted as well, or extraneous
residues are inserted adjacent to these residues; for example, residues inserted adjacent to residues 13-18 which are
the same as the above-suggested substitutions for the residues themselves.
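
The substitutions suggested for the GKKKER motif can be enumerated mechanically. The following is a minimal sketch, assuming only single-residue substitutions are wanted and reading the last option listed for E17 as Q (glutamine). The names `SUGGESTED` and `single_substitutions` are illustrative and do not come from the specification, and the listed residues are only the examples named in the text, not an exhaustive set.

```python
# Motif and numbering taken from the text: GKKKER occupies residues 13-18 of the NTD.
MOTIF_START = 13
MOTIF = "GKKKER"

# Example replacement residues named in the text for each position (one-letter codes).
SUGGESTED = {
    13: "PLIVAMFKDS",  # G13
    14: "RHDENQ",      # K14
    15: "RHDENQ",      # K15
    16: "RHDENQ",      # K16
    17: "DRKHNQ",      # E17 (final option read as Q; an assumption)
    18: "KHDENQ",      # R18
}


def single_substitutions(motif=MOTIF, start=MOTIF_START, table=SUGGESTED):
    """Return (label, mutated_motif) pairs, one per single-residue substitution."""
    variants = []
    for offset, wild_type in enumerate(motif):
        position = start + offset
        for residue in table[position]:
            mutated = motif[:offset] + residue + motif[offset + 1:]
            variants.append((f"{wild_type}{position}{residue}", mutated))
    return variants


if __name__ == "__main__":
    for label, mutated in single_substitutions():
        print(label, mutated)
```

Each label follows the usual wild-type/position/replacement convention (for example `G13P`). Deletions and insertions adjacent to residues 13-18 are not covered by this sketch.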

[0043] In another embodiment, enzymes or a nuclear regulatory protein such as a transcriptional regulatory factor
is fused to HRG-NTD, HRG-NTD-GFD, or HRG-GFD. The enzyme or factor is fused to the N- or C-terminus, or inserted
between the NTD and GFD domains, or is substituted for the region of NTD between the first about 23 residues and
the GFD.

[0044] "Isolated" HRG means HRG which has been identified and is free of components of its natural environment.
Contaminant components of its natural environment include materials which would interfere with diagnostic or therapeutic
uses for HRG, and may include proteins, hormones, and other substances. In preferred embodiments, HRG will
be purified (1) to greater than 95% by weight of protein as determined by the Lowry method or other validated protein
determination method, and most preferably more than 99% by weight, (2) to a degree sufficient to obtain at least 15
residues of N-terminal or internal amino acid sequence by use of the best commercially available amino acid sequenator
marketed on the filing date hereof, or (3) to homogeneity by SDS-PAGE using Coomassie blue or, preferably, silver
stain. Isolated HRG includes HRG in situ within heterologous recombinant cells since at least one component of HRG
natural environment will not be present. Isolated HRG includes HRG from one species in a recombinant cell culture of
another species since HRG in such circumstances will be devoid of source polypeptides. Ordinarily, however, isolated
HRG will be prepared by at least one purification step.

[0045] In accordance with this invention, HRG nucleic acid is RNA or DNA containing greater than ten bases that
encodes a biologically or antigenically active HRG, is complementary to nucleic acid sequence encoding such HRG,
or hybridizes to nucleic acid sequence encoding such HRG and remains stably bound to it under stringent conditions.
[0046] Preferably, HRG nucleic acid encodes a polypeptide sharing at least 75% sequence identity, more preferably 
at least 80%, still more preferably at least 85%, even more preferably at 90%, and most preferably 95%, with an HRG 
sequence. Preferably, the HRG nucleic acid that hybridizes contains at least 20, more preferably at least about 40,
and most preferably at least about 90 bases. Such hybridizing or complementary nucleic acid, however, is further
defined as being novel under 35 U.S.C. 102 and unobvious under 35 U.S.C. 103 over any prior art nucleic acid and
excludes nucleic acid encoding EGF, TGF-α, amphiregulin, HB-EGF, schwannoma factor or fragments or variants
thereof which would have been obvious as of the filing date hereof. 
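
The sequence identity thresholds above can be checked with a short calculation. The sketch below assumes the two amino acid sequences have already been aligned to equal length with `-` marking gaps, and counts identity over all aligned columns. The specification does not state an alignment method or how gaps are scored, so both choices are assumptions, and the function names are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue (gap columns count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap; they hold no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def preference_tier(identity):
    """Return the highest threshold from the text (75-95%) met by the identity, or None."""
    for threshold in (95, 90, 85, 80, 75):
        if identity >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not HRG sequences.
    identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(identity, preference_tier(identity))
```

A different gap convention (for example, dividing by the length of the shorter sequence) would give different percentages for the same pair, so the convention should be stated whenever a figure is reported.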

[0047] Isolated HRG nucleic acid includes a nucleic acid that is free from at least one contaminant nucleic acid with
which it is ordinarily associated in the natural source of HRG nucleic acid. Isolated HRG nucleic acid thus is present
in other than the form or setting in which it is found in nature. However, isolated HRG encoding nucleic acid includes
HRG nucleic acid in ordinarily HRG-expressing cells where the nucleic acid is in a chromosomal location different from
that of natural cells or is otherwise flanked by a different DNA sequence than that found in nature. Nucleic acid encoding
HRG may be used in specific hybridization assays, particularly those portions of HRG encoding sequence that do not
hybridize with other known DNA sequences, for example those encoding the EGF-like molecules of figure 6.
[0048] "Stringent conditions" are those that (1) employ low ionic strength and high temperature for washing, for
example, 0.015 M NaCl/0.0015 M sodium citrate/0.1% NaDodSO4 at 50°C; (2) employ during hybridization a denaturing
agent such as formamide, for example, 50% (vol/vol) formamide with 0.1% bovine serum albumin, 0.1% Ficoll,
0.1% polyvinylpyrrolidone, 50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at
42°C; or (3) employ 50% formamide, 5 x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8),
0.1% sodium pyrophosphate, 5 x Denhardt's solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS, and 10%
dextran sulfate at 42°C, with washes at 42°C in 0.2 x SSC and 0.1% SDS.
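
The salt concentrations in these three conditions are all multiples of one SSC stock. As a worked check, the sketch below takes 1 x SSC as 0.15 M NaCl and 0.015 M sodium citrate, which is the value implied by the text's "5 x SSC (0.75 M NaCl, 0.075 M sodium citrate)", and converts between fold strength and molarity. The function names are illustrative.

```python
import math

# 1 x SSC as implied by the text: 5 x SSC = 0.75 M NaCl, 0.075 M sodium citrate.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Molar concentration of each salt at the given SSC strength (e.g. 0.2 for 0.2 x SSC)."""
    return {salt: conc * fold for salt, conc in SSC_1X_MOLAR.items()}


def ssc_fold_from_nacl(nacl_molar):
    """SSC strength corresponding to a given NaCl molarity."""
    return nacl_molar / SSC_1X_MOLAR["NaCl"]


if __name__ == "__main__":
    # Wash in condition (3): 0.2 x SSC.
    print(ssc_molarity(0.2))
    # Wash in condition (1): 0.015 M NaCl corresponds to about 0.1 x SSC.
    print(round(ssc_fold_from_nacl(0.015), 3))
    # Hybridization in condition (2): 750 mM NaCl corresponds to about 5 x SSC.
    print(round(ssc_fold_from_nacl(0.750), 3))
```

On this reading, the wash in condition (1) is roughly 0.1 x SSC, the hybridization buffer in condition (2) carries the same salt as 5 x SSC, and the 0.2 x SSC wash in condition (3) is about 0.03 M NaCl with 0.003 M sodium citrate.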

[0049] Particular HRG-α nucleic acids are nucleic acids or oligonucleotides consisting of or comprising a nucleotide
sequence selected from Figs. 4a-4d and containing greater than 17 bases (when excluding nucleic acid sequences of
human small polydisperse circular DNA (HUMPC125), chicken c-mos proto-oncogene homolog (CHKMOS), basement
membrane heparin sulfate proteoglycan (HUMBMHSP) and human lipocortin 2 pseudogene (complete cds-like region,
HUML1P2B), ordinarily greater than 20 bases, preferably greater than 25 bases, together with the complementary
sequences thereof.

[0050] Particular HRG-β1, -β2 or -β3 nucleic acids are nucleic acids or oligonucleotides consisting of or comprising
a nucleotide sequence selected from Figs. 8a-8d, 12a-12e or 13a-13c and containing greater than 20 bases, but not
including the polyA sequence found at the 3' end of each gene as noted in the Figures, together with the complements
to such sequences. Preferably the sequence contains greater than 25 bases. HRG-β sequences also may
exclude the human small polydisperse circular DNA sequence (HUMPC125).

[0051] In other embodiments, the HRG nucleotide sequence contains a 15 or more base HRG sequence and is 
selected from within the sequence encoding the HRG domain extending from the N-terminus of the GFD to the N- 
terminus of the transmembrane sequence (or the complement of that nucleic acid sequence). For example, with respect 
to HRG-α, the nucleotide sequence is selected from within the sequence 678-869 (Fig. 4b) and contains a sequence
of 15 or more bases from this section of the HRG nucleic acid. 
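
Selecting candidate sequences of 15 or more bases from a defined stretch amounts to sliding a window along it. The sketch below assumes 1-based, inclusive coordinates as used in the text (678-869) and uses a placeholder sequence, because the actual Fig. 4b sequence is not reproduced here. The function name `probe_windows` is illustrative.

```python
def probe_windows(sequence, start, end, length=15):
    """Yield (position, subsequence) for every window of `length` bases inside
    the region start..end, using 1-based inclusive coordinates."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("region lies outside the sequence")
    region = sequence[start - 1:end]
    for offset in range(len(region) - length + 1):
        yield start + offset, region[offset:offset + length]


if __name__ == "__main__":
    # Placeholder sequence standing in for the HRG-alpha cDNA; NOT the real sequence.
    placeholder = "ACGT" * 250
    windows = list(probe_windows(placeholder, 678, 869, length=15))
    print(len(windows), windows[0], windows[-1])
```

A region of 192 bases (678 through 869) yields 178 distinct 15-base windows, the last starting at position 855. Longer probes are obtained by raising `length`.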

[0052] In other embodiments, the HRG nucleic acid sequence is greater than 14 bases and is selected from a nu- 
cleotide sequence unique to each subtype, for instance a nucleic acid sequence encoding an amino acid sequence 
that is unique to each of the HRG subtypes (or the complement of that nucleic acid sequence). These sequences are 
useful in diagnostic assays for expression of the various subtypes, as well as specific amplification of the subtype DNA. 
For example, the HRG-α sequence of interest would be selected from the sequence encoding the unique N-terminus
or GFD-transmembrane joining sequence, e.g. about bp 771-860. Similarly, a unique HRG-β1 sequence is that which
encodes the last 15 C-terminal amino acid residues; this sequence is not found in HRG-α.

[0053] In general, the length of the HRG-α or -β sequence beyond the above-indicated number of bases
is immaterial since all of such nucleic acids are useful as probes or amplification primers. The selected HRG sequence
may contain additional HRG sequence, either the normal flanking sequence or other regions of the HRG nucleic acid,
as well as other nucleic acid sequences. For purposes of hybridization, only the HRG sequence is material.
[0054] The term "control sequences" refers to DNA sequences necessary for the expression of an operably linked
coding sequence in a particular host organism. The control sequences that are suitable for prokaryotes, for example, 
include a promoter, optionally an operator sequence, a ribosome binding site, and possibly, other as yet poorly under- 
stood sequences. Eukaryotic cells are known to utilize promoters, polyadenylation signals, and enhancers. 
[0055] Nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid
sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is
expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably
linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked
to a coding sequence if it is positioned so as to facilitate translation. Generally, "operably linked" means that the DNA
sequences being linked are contiguous and, in the case of a secretory leader, contiguous and in reading phase.
However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If
such sites do not exist, then synthetic oligonucleotide adaptors or linkers are used in accord with conventional practice. 
[0056] An "exogenous" element is defined herein to mean nucleic acid sequence that is foreign to the cell, or homologous
to the cell but in a position within the host cell nucleic acid in which the element is ordinarily not found.
[0057] As used herein, the expressions "cell", "cell line", and "cell culture" are used interchangeably, and all such
designations include progeny. Thus, the words "transformants" and "transformed cells" include the primary subject cell
and cultures derived therefrom without regard for the number of transfers. It is also understood that all progeny may
not be precisely identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny that have the
same function or biological activity as screened for in the originally transformed cell are included. It will be clear from
the context where distinct designations are intended.

[0058] "Plasmids" are designated by a lower case "p" preceded and/or followed by capital letters and/or numbers. 
The starting plasmids herein are commercially available, are publicly available on an unrestricted basis, or can be 
constructed from such available plasmids in accord with published procedures. In addition, other equivalent plasmids
are known in the art and will be apparent to the ordinary artisan. 

[0059] "Restriction Enzyme Digestion" of DNA refers to catalytic cleavage of the DNA with an enzyme that acts only
at certain locations in the DNA. Such enzymes are called restriction endonucleases, and the site for which each is
specific is called a restriction site. The various restriction enzymes used herein are commercially available and their
reaction conditions, cofactors, and other requirements as established by the enzyme suppliers are used. Restriction
enzymes commonly are designated by abbreviations composed of a capital letter followed by other letters representing
the microorganism from which each restriction enzyme originally was obtained, and then a number designating the
particular enzyme. In general, about 1 µg of plasmid or DNA fragment is used with about 1-2 units of enzyme in about
20 µl of buffer solution. Appropriate buffers and substrate amounts for particular restriction enzymes are specified by
the manufacturer. Incubation of about 1 hour at 37°C is ordinarily used, but may vary in accordance with the supplier's
instructions. After incubation, protein or polypeptide is removed by extraction with phenol and chloroform, and the
digested nucleic acid is recovered from the aqueous fraction by precipitation with ethanol. Digestion with a restriction
enzyme may be followed with bacterial alkaline phosphatase hydrolysis of the terminal 5' phosphates to prevent the
two restriction cleaved ends of a DNA fragment from "circularizing" or forming a closed loop that would impede insertion
of another DNA fragment at the restriction site. Unless otherwise stated, digestion of plasmids is not followed by 5'
terminal dephosphorylation. Procedures and reagents for dephosphorylation are conventional as described in sections
1.56-1.61 of Sambrook et al. (Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Laboratory
Press, 1989).
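
To illustrate what "acts only at certain locations" means in practice, the sketch below locates recognition sites in a linear sequence and reports the fragments a complete digest would give. The three enzymes and their sites (EcoRI G^AATTC, BamHI G^GATCC, HindIII A^AGCTT) are common examples chosen for illustration and are not named in this passage. The test sequence is invented, and only top-strand cut positions on linear DNA are modelled.

```python
# Recognition site and top-strand cut offset (bases from the start of the site).
RECOGNITION_SITES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def cut_positions(sequence, enzyme):
    """Return 0-based indices at which the top strand is cut by `enzyme`."""
    site, offset = RECOGNITION_SITES[enzyme]
    sequence = sequence.upper()
    positions = []
    index = sequence.find(site)
    while index != -1:
        positions.append(index + offset)
        index = sequence.find(site, index + 1)
    return positions


def digest_linear(sequence, enzyme):
    """Return the top-strand fragments from a complete digest of linear DNA."""
    sequence = sequence.upper()
    bounds = [0] + cut_positions(sequence, enzyme) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AAAGAATTCTTTTGAATTCCC"  # invented sequence with two EcoRI sites
    print(cut_positions(toy, "EcoRI"))
    print(digest_linear(toy, "EcoRI"))
```

A circular plasmid cut at n sites gives n fragments rather than n + 1, which this linear model does not handle.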

[0060] "Ligation" refers to the process of forming phosphodiester bonds between two nucleic acid fragments. To
ligate the DNA fragments together, the ends of the DNA fragments must be compatible with each other. In some cases,
the ends will be directly compatible after endonuclease digestion. However, it may be necessary to first convert the
staggered ends commonly produced after endonuclease digestion to blunt ends to make them compatible for ligation.
To blunt the ends, the DNA is treated in a suitable buffer for at least 15 minutes at 15°C with about 10 units of the
Klenow fragment of DNA polymerase I or T4 DNA polymerase in the presence of the four deoxyribonucleotide
triphosphates. The DNA is then purified by phenol-chloroform extraction and ethanol precipitation. The DNA fragments that
are to be ligated together are put in solution in about equimolar amounts. The solution will also contain ATP, ligase
buffer, and a ligase such as T4 DNA ligase at about 10 units per 0.5 µg of DNA. If the DNA is to be ligated into a vector,
the vector is first linearized by digestion with the appropriate restriction endonuclease(s). The linearized fragment is
then treated with bacterial alkaline phosphatase or calf intestinal phosphatase to prevent self-ligation during the ligation
step.
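
"About equimolar amounts" translates into masses once fragment lengths are known, because the mass of double-stranded DNA per mole scales with its length. The sketch below is a rough calculation under that assumption, taking the ligase figure in the text as about 10 units per 0.5 µg of DNA. The function names and the example vector and insert sizes are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=1.0):
    """Mass of insert (ng) giving `molar_ratio` moles of insert per mole of vector.

    Assumes mass per mole is proportional to length for double-stranded DNA.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


def ligase_units(total_dna_ug, units_per_half_ug=10.0):
    """Units of ligase for the total DNA, scaled from about 10 units per 0.5 ug."""
    return units_per_half_ug * (total_dna_ug / 0.5)


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 5000 bp vector and a 1000 bp insert.
    insert_ng = insert_mass_ng(100.0, 5000, 1000)
    total_ug = (100.0 + insert_ng) / 1000.0
    print(round(insert_ng, 1), "ng insert;", round(ligase_units(total_ug), 2), "units ligase")
```

Raising `molar_ratio` above 1.0 models the common practice of using excess insert, which is a choice outside what the text specifies.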

[0061] The technique of "polymerase chain reaction," or "PCR," as used herein generally refers to a procedure wherein
minute amounts of a specific piece of nucleic acid, RNA and/or DNA, are amplified as described in U.S. Pat. No.
4,683,195, issued 28 July 1987. Generally, sequence information from the ends of the region of interest or beyond
needs to be available, such that oligonucleotide primers can be designed; these primers will be identical or similar in
sequence to opposite strands of the template to be amplified. The 5' terminal nucleotides of the two primers may
coincide with the ends of the amplified material. PCR can be used to amplify specific RNA sequences, specific DNA
sequences from total genomic DNA, and cDNA transcribed from total cellular RNA, bacteriophage or plasmid sequences,
etc. See generally Mullis et al., Cold Spring Harbor Symp. Quant. Biol. 51:263 (1987); Erlich, ed., PCR Technology,
(Stockton Press, NY, 1989). As used herein, PCR is considered to be one, but not the only, example of a nucleic acid
polymerase reaction method for amplifying a nucleic acid test sample, comprising the use of a known nucleic acid
(DNA or RNA) as a primer, and utilizes a nucleic acid polymerase to amplify or generate a specific piece of nucleic
acid or to amplify or generate a specific piece of nucleic acid which is complementary to a particular nucleic acid.
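
The statement that the two primers match opposite strands at the ends of the region can be made concrete. In the sketch below, the forward primer is the first bases of the top strand and the reverse primer is the reverse complement of the last bases, so that the 5' ends of the primers coincide with the ends of the amplified material. The target sequence and primer length are invented for illustration, and real primer design would also weigh melting temperature and specificity, which are ignored here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(target, primer_length=20):
    """Return (forward, reverse) primers, both written 5'->3', flanking `target`.

    `target` is the top strand of the region to amplify, written 5'->3'.
    """
    target = target.upper()
    if len(target) < 2 * primer_length:
        raise ValueError("target too short for two non-overlapping primers")
    forward = target[:primer_length]                       # identical to top strand
    reverse = reverse_complement(target[-primer_length:])  # identical to bottom strand
    return forward, reverse


if __name__ == "__main__":
    toy_target = "ATGCCCAAAGGGTTTTAG"  # invented sequence
    print(design_primers(toy_target, primer_length=6))
```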
[0062] The "HRG tyrosine autophosphorylation assay" to detect the presence of HRG ligands was used to monitor
the purification of a ligand for the p185HER2 receptor. This assay is based on the assumption that a specific ligand for
the p185HER2 receptor will stimulate autophosphorylation of the receptor, in analogy with EGF and its stimulation of
EGF receptor autophosphorylation. MDA-MB-453 cells or MCF7 cells, which contain high levels of p185HER2 receptors
but negligible levels of human EGF receptors, were obtained from the American Type Culture Collection, Rockville,
Md. (ATCC No. HTB-131) and maintained in tissue culture with 10% fetal calf serum in DMEM/Hams F12 (1:1) media.
For assay, the cells were trypsinized and plated at about 150,000 cells/well in 24 well dishes (Costar). After incubation
with serum containing media overnight, the cells were placed in serum free media for 2-18 hours before assay. Test
samples of 100 uL aliquots were added to each well. The cells were incubated for 5-30 minutes (typically 30 min) at
37°C and the media removed. The cells in each well were treated with 100 uL SDS gel denaturing buffer (Seprosol,
Enpotech, Inc.) and the plates heated at 100°C for 5 minutes to dissolve the cells and denature the proteins. Aliquots
from each well were electrophoresed on 5-20% gradient SDS gels (Novex, Encinitas, CA) according to the manufacturer's
directions. After the dye front reached the bottom of the gel, the electrophoresis was terminated and a sheet of
PVDF membrane (ProBlott, ABI) was placed on the gel and the proteins transferred from the gel to the membrane in
a blotting chamber (BioRad) at 200 mAmps for 30-60 min. After blotting, the membranes were incubated with Tris
buffered saline containing 0.1% Tween 20 detergent buffer with 5% BSA for 2-18 hrs to block nonspecific binding, and
then treated with a mouse anti-phosphotyrosine antibody (Upstate Biological Inc., N.Y.). Subsequently, the membrane
blots were treated with goat anti-mouse antibody conjugated to alkaline phosphatase. The gels were developed using
the ProtoBlot System from Promega. After drying the membranes, the density of the bands corresponding to p185HER2
in each sample lane was quantitated with a Hewlett Packard ScanJet Plus Scanner attached to a Macintosh computer.
The number of receptors per cell in the MDA-MB-453 or MCF-7 cells is such that under these experimental conditions
the p185HER2 receptor protein is the major protein which is labeled.

[0063] "Protein microsequencing" was accomplished based upon the following procedures. Proteins from the final
HPLC step were either sequenced directly by automated Edman degradation with a model 470A Applied Biosystems
gas phase sequencer equipped with a 120A PTH amino acid analyzer or sequenced after digestion with various chemicals
or enzymes. PTH amino acids were integrated using the Chrom Perfect data system (Justice Innovations, Palo
Alto, CA). Sequence interpretation was performed on a VAX 11/785 Digital Equipment Corporation computer as described
(Henzel et al., J. Chromatography 404:41-52 (1987)). In some cases, aliquots of the HPLC fractions were
electrophoresed on 5-20% SDS polyacrylamide gels, electrotransferred to a PVDF membrane (ProBlott, ABI, Foster
City, CA) and stained with Coomassie Brilliant Blue (Matsudaira, P., J. Biol. Chem. 262:10035-10038, 1987). The
specific protein was excised from the blot for N-terminal sequencing. To determine internal protein sequences, HPLC
fractions were dried under vacuum (SpeedVac), resuspended in appropriate buffers, and digested with cyanogen bromide,
the lysine-specific enzyme Lys-C (Wako Chemicals, Richmond, VA) or Asp-N (Boehringer Mannheim, Indianapolis,
Ind.). After digestion, the resultant peptides were sequenced as a mixture or were resolved by HPLC on a C4
column developed with a propanol gradient in 0.1% TFA before sequencing as described above.
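
The three cleavage reagents named above differ in where they cut, which determines the internal peptides available for sequencing. The sketch below applies the usual textbook specificities: cyanogen bromide cleaves C-terminal to Met, Lys-C cleaves C-terminal to Lys, and Asp-N cleaves N-terminal to Asp. Complete digestion is assumed, and the peptide is an invented example that merely reuses the GKKKER motif.

```python
def cleave_after(sequence, residue):
    """Split a peptide C-terminal to every occurrence of `residue`."""
    fragments, current = [], ""
    for amino_acid in sequence:
        current += amino_acid
        if amino_acid == residue:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def cleave_before(sequence, residue):
    """Split a peptide N-terminal to every occurrence of `residue`."""
    fragments, current = [], ""
    for amino_acid in sequence:
        if amino_acid == residue and current:
            fragments.append(current)
            current = ""
        current += amino_acid
    if current:
        fragments.append(current)
    return fragments


DIGESTS = {
    "CNBr": lambda seq: cleave_after(seq, "M"),    # C-terminal to Met
    "Lys-C": lambda seq: cleave_after(seq, "K"),   # C-terminal to Lys
    "Asp-N": lambda seq: cleave_before(seq, "D"),  # N-terminal to Asp
}

if __name__ == "__main__":
    toy_peptide = "GKKKERMADKLT"  # invented peptide, for illustration only
    for reagent, digest in DIGESTS.items():
        print(reagent, digest(toy_peptide))
```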

II. USE AND PREPARATION OF HRG POLYPEPTIDES 

1. PREPARATION OF HRG POLYPEPTIDES INCLUDING VARIANTS

[0064] The system to be employed in preparing HRG polypeptides will depend upon the particular HRG sequence
selected. If the sequence is sufficiently small, HRG is prepared by in vitro polypeptide synthetic methods. Most commonly,
however, HRG is prepared in recombinant cell culture using the host-vector systems described below.
[0065] In general, mammalian host cells will be employed, and such hosts may or may not contain post-translational
systems for processing HRG prosequences in the normal fashion. If the host cells contain such systems then it will be
possible to recover natural subdomain fragments such as HRG-GFD or HRG-NTD-GFD from the cultures. If not, then
the proper processing can be accomplished by transforming the hosts with the required enzyme(s) or by cleaving the
precursor in vitro. However, it is not necessary to transform cells with DNA encoding the complete prosequence for a
selected HRG when it is desired to only produce fragments of HRG sequences such as an HRG-GFD. For example,
to prepare HRG-GFD a start codon is ligated to the 5' end of DNA encoding an HRG-GFD, this DNA is used to transform
host cells and the product expressed directly as the Met N-terminal form (if desired, the extraneous Met may be removed
in vitro or by endogenous N-terminal demethionylases). Alternatively, HRG-GFD is expressed as a fusion with a signal
sequence recognized by the host cell, which will process and secrete the mature HRG-GFD as is further described
below. Amino acid sequence variants of native HRG-GFD sequences are produced in the same way.
[0066] HRG-NTD is produced in the same fashion as the full length molecule but from expression of DNA encoding
only HRG-NTD, with the stop codon after one of S172-C182 (Fig. 15).
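
The two constructions just described, adding a start codon to a fragment for direct expression and placing a stop codon after a chosen residue, are simple string operations at the DNA level. The sketch below assumes an in-frame coding sequence whose first codon encodes residue 1. The toy sequence, the choice of `TAA` as stop codon, and the function names are illustrative assumptions, not details from the specification.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def truncate_after_residue(cds, last_residue, stop="TAA"):
    """Keep codons 1..last_residue of an in-frame coding sequence and append a stop codon."""
    if stop not in STOP_CODONS:
        raise ValueError("not a stop codon")
    end = 3 * last_residue
    if last_residue < 1 or end > len(cds):
        raise ValueError("residue number outside the coding sequence")
    return cds[:end].upper() + stop


def direct_expression_orf(fragment, stop="TAA"):
    """Prepend a start codon and append a stop codon to an in-frame fragment,
    giving the Met N-terminal form described in the text."""
    if len(fragment) % 3 != 0:
        raise ValueError("fragment length must be a multiple of three")
    if stop not in STOP_CODONS:
        raise ValueError("not a stop codon")
    return "ATG" + fragment.upper() + stop


if __name__ == "__main__":
    toy_cds = "ATGGCTAAACCCGGGTTTTAA"  # invented: Met-Ala-Lys-Pro-Gly-Phe-stop
    print(truncate_after_residue(toy_cds, 3))   # stop placed after residue 3
    print(direct_expression_orf("CCCGGGTTT"))   # fragment expressed with added Met
```

For an NTD construct, `last_residue` would be chosen from the range given in the text (one of residues 172 to 182 of Fig. 15), applied to the actual coding sequence rather than this toy example.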

[0067] In addition, HRG variants are expressed from DNA encoding protein in which both the GFD and NTD domains
are in their proper orientation but which contain an amino acid insertion, deletion or substitution at the NTD-GFD joining
site (for example located within the sequence S172-C182). In another embodiment a stop codon is positioned at the 3'
end of the NTD-GFD-encoding sequence (after any residue T/Q222-T245 of Fig. 15). The result is a soluble form of
HRG-α or -β1 or -β2 which lacks its transmembrane sequence (this sequence also may be an internal signal sequence
but will be referred to as a transmembrane sequence). In further variations of this embodiment, an internal signal
sequence of another polypeptide is substituted in place of the native HRG transmembrane domain, or a cytoplasmic
domain of another cell membrane polypeptide, e.g. receptor kinase, is substituted for the HRG-α or HRG-β1-β2
cytoplasmic peptide.

[0068] In a still further embodiment, the NTD, GFD and transmembrane domains of HRG and other EGF family
members are substituted for one another, e.g. the NTD equivalent region of EGF is substituted for the NTD of HRG,
or the GFD of HRG is substituted for EGF in the processed, soluble proform of EGF. Alternatively, an HRG or EGF
family member transmembrane domain is fused onto the C-terminal E236 of HRG-β3.

[0069] In a further variant, the HRG sequence spanning K241 to the C-terminus is fused at its N-terminus to the
C-terminus of a non-HRG polypeptide.

[0070] Another embodiment comprises the functional or structural deletion of the proteolytic processing site in CTP,
the GFD-transmembrane spanning domain. For example, the putative C-terminal lysine (K241) of processed HRG-α
or -β1-β2 is deleted, substituted with another residue, a residue other than K or R inserted between K241 and R242, or
other disabling mutation is made in the prosequence.

[0071] In another embodiment, the domain of any EGF family member extending from (a) its cysteine corresponding
to (b) C221 to the C-terminal residue of the family member is substituted for the analogous domain of HRG-α or -β1
or -β2 (or fused to the C-terminus of HRG-β3). Such variants will be processed free of host cells in the same fashion
as the family member rather than as the parental HRG. In more refined embodiments other specific cleavage sites
(e.g. protease sites) are substituted into the CTP or GFD-transmembrane spanning domain (about residues T/Q222-T245,
Fig. 15). For example, amphiregulin sequence E84-K99 or TGF-α sequence E44-K58 is substituted for HRG-α residues
E223-K241.

[0072] In a further embodiment, a variant (termed HRG-NTDxGFD) is prepared wherein (1) the lysine residue found
in the NTD-GFD joining sequence VKC (residues 180-182, Figure 15) is deleted or (preferably) substituted by another
residue other than R such as H, A, T or S and (2) a stop codon is introduced in the sequence RCT or RCQ (residues
220-222, Figure 15) in place of C, or T (for HRG-α) or Q (for HRG-β).

[0073] A preferred HRG-α ligand with binding affinity to p185HER2 comprises amino acids 226-265 of figure 4. This
HRG-α ligand further may comprise up to an additional 1-20 amino acids preceding amino acid 226 from figure 4 and
1-20 amino acids following amino acid 265 from figure 4. A preferred HRG-β ligand with binding affinity to p185HER2
comprises amino acids 225-265 of figure 8. This HRG-β ligand may comprise up to an additional 1-20 amino acids
preceding amino acid 226 from figure 8 and 1-20 amino acids following amino acid 265 from figure 8.
[0074] GFD sequences include those in which one or more residues corresponding to another member of the EGF 
family are deleted or substituted or have a residue inserted adjacent thereto. For example, F216 of HRG is substituted
by Y, L202 with E, F189 with Y, or S203-P205 is deleted.

[0075] HRG also includes NTD-GFD having its C-terminus at one of the first about 1 to 3 extracellular domain residues
(QKR, residues 240-243, HRG-α, Figure 15) or first about 1-2 transmembrane region residues. In addition, in some
HRG-GFD variants the codons are modified at the GFD-transmembrane proteolysis site by substitution, insertion or
deletion. The GFD proteolysis site is the domain that contains the GFD C-terminal residue and about 5 residues N-
and 5 residues C-terminal from this residue. At this time neither the natural C-terminal residue for HRG-α nor that for
HRG-β has been identified, although it is known that Met-227 terminal and Val-229 terminal HRG-α-GFD are biologically active.
The native C-terminus for HRG-α-GFD is probably Met-227, Lys-228, Val-229, Gln-230, Asn-231 or Gln-232, and for
HRG-β1-β2-GFD is probably Met-226, Ala-227, Ser-228, Phe-229, Trp-230, Lys-231 or (for HRG-β1) K240 or (for
HRG-β2) K246. The native C-terminus is determined readily by C-terminal sequencing, although it is not critical that HRG-
GFD have the native terminus so long as the GFD sequence possesses the desired activity. In some embodiments of
HRG-GFD variants, the amino acid change(s) in the CTP are screened for their ability to resist proteolysis in vitro and
inhibit the protease responsible for generation of HRG-GFD.

[0076] If it is desired to prepare the full length HRG polypeptides and the 5' or 3' ends of the given HRG are not
described herein, it may be necessary to prepare nucleic acids in which the missing domains are supplied by homologous
regions from more complete HRG nucleic acids. Alternatively, the missing domains can be obtained by probing
libraries using the DNAs disclosed in the Figures or fragments thereof.

A. Isolation of DNA Encoding Heregulin 

[0077] The DNA encoding HRG may be obtained from any cDNA library prepared from tissue believed to possess 
HRG mRNA and to express it at a detectable level. HRG DNA also is obtained from a genomic library. 
[0078] Libraries are screened with probes or analytical tools designed to identify the gene of interest or the protein 
encoded by it. For cDNA expression libraries, suitable probes include monoclonal or polyclonal antibodies that recog- 
nize and specifically bind to HRG; oligonucleotides of about 20-80 bases in length that encode known or suspected 
portions of HRG cDNA from the same or different species; and/or complementary or homologous cDNAs or fragments 
thereof that encode the same or a hybridizing gene. Appropriate probes for screening genomic DNA libraries include,
but are not limited to, oligonucleotides; cDNAs or fragments thereof that encode the same or hybridizing DNA; and/or
homologous genomic DNAs or fragments thereof. Screening the cDNA or genomic library with the selected probe may
be conducted using standard procedures as described in chapters 10-12 of Sambrook et al., supra.
[0079] An alternative means to isolate the gene encoding HRG is to use polymerase chain reaction (PCR) methodology
as described in section 14 of Sambrook et al., supra. This method requires the use of oligonucleotide probes
that will hybridize to HRG. Strategies for selection of oligonucleotides are described below.

[0080] Another alternative method for obtaining the gene of interest is to chemically synthesize it using one of the
methods described in Engels et al. (Angew. Chem. Int. Ed. Engl., 28:716-734, 1989). These methods include triester,
phosphite, phosphoramidite and H-Phosphonate methods, PCR and other autoprimer methods, and oligonucleotide
syntheses on solid supports. These methods may be used if the entire nucleic acid sequence of the gene is known,
or the sequence of the nucleic acid complementary to the coding strand is available, or alternatively, if the target amino
acid sequence is known, one may infer potential nucleic acid sequences using known and preferred coding residues
for each amino acid residue.

[0081] A preferred method of practicing this invention is to use carefully selected oligonucleotide sequences to screen
cDNA libraries from various tissues, preferably human breast, colon, salivary gland, placental, fetal, brain, and carcinoma
cell lines. Other biological sources of DNA encoding a heregulin-like ligand include other mammals and birds.
Among the preferred mammals are members of the following orders: bovine, ovine, equine, murine, and rodentia.
[0082] The oligonucleotide sequences selected as probes should be of sufficient length and sufficiently unambiguous
that false positives are minimized. The actual nucleotide sequence(s) is usually based on conserved or highly homologous
nucleotide sequences or regions of HRG-α. The oligonucleotides may be degenerate at one or more positions.
The use of degenerate oligonucleotides may be of particular importance where a library is screened from a species in
which preferential codon usage in that species is not known. The oligonucleotide must be labeled such that it can be
detected upon hybridization to DNA in the library being screened. The preferred method of labeling is to use 32P-labeled
ATP with polynucleotide kinase, as is well known in the art, to radiolabel the oligonucleotide. However, other methods
may be used to label the oligonucleotide, including, but not limited to, biotinylation or enzyme labeling.
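
One way to keep a degenerate probe "sufficiently unambiguous" is to count how many distinct coding sequences could specify a stretch of known amino acid sequence and to pick the stretch with the fewest. The sketch below does this using codon counts from the standard genetic code. It is one common heuristic, not a method stated in the specification, and the example peptide is invented.

```python
# Number of codons per amino acid in the standard genetic code (61 sense codons in total).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that could encode `peptide`."""
    total = 1
    for amino_acid in peptide.upper():
        total *= CODON_COUNTS[amino_acid]
    return total


def least_degenerate_window(peptide, window):
    """Return (index, subpeptide, degeneracy) for the window of `window` residues
    with the lowest degeneracy; index is 0-based, ties go to the first window."""
    if window < 1 or window > len(peptide):
        raise ValueError("window must fit inside the peptide")
    candidates = [
        (degeneracy(peptide[i:i + window]), i, peptide[i:i + window])
        for i in range(len(peptide) - window + 1)
    ]
    best_degeneracy, index, subpeptide = min(candidates)
    return index, subpeptide, best_degeneracy


if __name__ == "__main__":
    toy_peptide = "GKKKERMW"  # invented peptide, for illustration only
    print(degeneracy("GKKKER"))
    print(least_degenerate_window(toy_peptide, 2))
```

A window of n residues corresponds to a probe of 3n bases (or 3n - 1 if the final wobble position is dropped), so the window size can be set from the probe lengths discussed earlier.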
[0083] Of particular interest is HRG nucleic acid that encodes the full-length propolypeptide. In some preferred embodiments,
the nucleic acid sequence includes the native HRG signal transmembrane sequence. Nucleic acid having
all the protein coding sequence is obtained by screening selected cDNA or genomic libraries, and, if necessary, using
conventional primer extension procedures as described in section 7.79 of Sambrook et al., supra, to detect precursors
and processing intermediates of mRNA that may not have been reverse-transcribed into cDNA.
[0084] HRG encoding DNA is used to isolate DNA encoding the analogous ligand from other animal species via
hybridization employing the methods discussed above. The preferred animals are mammals, particularly bovine, ovine,
equine, feline, canine and rodentia, and more specifically rats, mice and rabbits.

B. Amino Acid Sequence Variants of Heregulin 

[0085] Amino acid sequence variants of HRG are prepared by introducing appropriate nucleotide changes into HRG
DNA, or by in vitro synthesis of the desired HRG polypeptide. Such variants include, for example, deletions from, or
insertions or substitutions of, residues within the amino acid sequence shown for human HRG sequences. Any combination
of deletion, insertion, and substitution can be made to arrive at the final construct, provided that the final
construct possesses the desired characteristics. The amino acid changes also may alter post-translational processes
of HRG-α, such as changing the number or position of glycosylation sites, altering the membrane anchoring characteristics,
altering the intra-cellular location of HRG by inserting, deleting, or otherwise affecting the transmembrane
sequence of native HRG, or modifying its susceptibility to proteolytic cleavage.

[0086] In designing amino acid sequence variants of HRG, the location of the mutation site and the nature of the
mutation will depend on the HRG characteristic(s) to be modified. The sites for mutation can be modified individually or in
series, e.g., by (1) substituting first with conservative amino acid choices and then with more radical selections
depending upon the results achieved, (2) deleting the target residue, or (3) inserting residues of other ligands adjacent
to the located site.

[0087] A useful method for identification of HRG residues or regions for mutagenesis is called "alanine scanning
mutagenesis" as described by Cunningham and Wells (Science, 244:1081-1085, 1989). Here, a residue or group of
target residues is identified (e.g., charged residues such as arg, asp, his, lys, and glu) and replaced by a neutral or
negatively charged amino acid (most preferably alanine or polyalanine) to affect the interaction of the amino acids with
the surrounding aqueous environment in or outside the cell. Those domains demonstrating functional sensitivity to the
substitutions then are refined by introducing further or other variants at or for the sites of substitution. Thus, while the
site for introducing an amino acid sequence variation is predetermined, the nature of the mutation per se need not be
predetermined. For example, to optimize the performance of a mutation at a given site, ala scanning or random
mutagenesis may be conducted at the target codon or region and the expressed HRG variants are screened for the optimal
combination of desired activity.
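
The scan described above can be enumerated programmatically. The sketch below is illustrative only: the function name and the 12-residue test peptide are hypothetical, and one-letter codes are assumed. It produces one single-alanine variant per charged residue (arg, asp, his, lys, glu), named in the usual original-position-replacement form.

```python
# Charged residues named in the text: arg, asp, his, lys, glu.
CHARGED = frozenset("RDHKE")


def alanine_scan(sequence, targets=CHARGED):
    """Return (name, variant_sequence) pairs, one per targeted residue.

    Positions are 1-based, so a lysine at the fifth position yields "K5A".
    """
    variants = []
    for index, residue in enumerate(sequence):
        if residue in targets and residue != "A":
            name = f"{residue}{index + 1}A"
            variant = sequence[:index] + "A" + sequence[index + 1:]
            variants.append((name, variant))
    return variants


# Hypothetical peptide used only to exercise the function.
scan = alanine_scan("MSERKEGRGKGK")
```

Each variant would then be expressed and screened; positions whose replacement changes activity mark the regions to refine with further substitutions.
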

[0088] There are two principal variables in the construction of amino acid sequence variants: the location of the
mutation site and the nature of the mutation. These are variants from the HRG sequence, and may represent naturally
occurring alleles (which will not require manipulation of HRG DNA) or predetermined mutant forms made by mutating
the DNA, either to arrive at an allele or a variant not found in nature. In general, the location and nature of the mutation
chosen will depend upon the HRG characteristic to be modified. Obviously, such variations that, for example, convert HRG
into a known receptor ligand, are not included within the scope of this invention, nor are any other HRG variants or
polypeptide sequences that are not novel and unobvious over the prior art.

[0089] Amino acid sequence deletions generally range from about 1 to 30 residues, more preferably about 1 to 10
residues, and typically about 1 to 5 contiguous residues. Deletions may be introduced into regions of low homology
with other EGF family precursors to modify the activity of HRG. Deletions from HRG in areas of substantial homology
with other EGF family sequences will be more likely to modify the biological activity of HRG significantly. The
number of consecutive deletions will be selected so as to preserve the tertiary structure of HRG in the affected domain,
e.g., cysteine crosslinking, beta-pleated sheet or alpha helix.

[0090] Amino acid sequence insertions include amino- and/or carboxyl-terminal fusions ranging in length from one
residue to polypeptides containing a hundred or more residues, as well as intrasequence insertions of single or multiple
amino acid residues. Intrasequence insertions (i.e., insertions within the HRG sequence) may range generally from about
1 to 10 residues, more preferably 1 to 5, and most preferably 1 to 3. Examples of terminal insertions include HRG with
an N-terminal methionyl residue (an artifact of the direct expression of HRG in bacterial recombinant cell culture), and
fusion of a heterologous N-terminal signal sequence to the N-terminus of HRG to facilitate the secretion of mature
HRG from recombinant host cells. Such signal sequences generally will be obtained from, and thus be homologous
to, the intended host cell species. Suitable sequences include STII or lpp for E. coli, alpha factor for yeast, and viral
signals such as herpes gD for mammalian cells.

[0091] Other insertional variants of HRG include the fusion of the N- or C-terminus of HRG to an immunogenic
polypeptide, e.g., bacterial polypeptides such as beta-lactamase or an enzyme encoded by the E. coli trp locus, or
yeast protein, bovine serum albumin, and chemotactic polypeptides. C-terminal fusions of HRG-NTD-GFD with proteins
having a long half-life such as immunoglobulin constant regions (or other immunoglobulin regions), albumin, or ferritin,
as described in WO 89/02922, published 6 April 1989, are included.

[0092] Another group of variants are amino acid substitution variants. These variants have at least one amino acid
residue in the HRG molecule removed and a different residue inserted in its place. The sites of greatest interest for
substitutional mutagenesis include sites identified as the active site(s) of HRG, and sites where the amino acids found
in HRG ligands from various species are substantially different in terms of side-chain bulk, charge, and/or
hydrophobicity.

[0093] The amino terminus of the cytoplasmic region of HRG may be fused to the carboxy terminus of heterologous
transmembrane domains and receptors, to form a fusion polypeptide useful for intracellular signaling of a ligand binding
to the heterologous receptor.

[0094] Other sites of interest are those in which particular residues of HRG-like ligands obtained from various species
are identical. These positions may be important for the biological activity of HRG. These sites, especially those falling
within a sequence of at least three other identically conserved sites, are substituted in a relatively conservative manner.
Such conservative substitutions are shown in Table 1 under the heading of "preferred substitutions". If such substitutions
result in a change in biological activity, then more substantial changes, denominated exemplary substitutions in Table
1, or as further described below in reference to amino acid classes, are introduced and the products screened.



TABLE 1

Original Residue    Exemplary Substitutions                  Preferred Substitutions
Ala (A)             val; leu; ile                            val
Arg (R)             lys; gln; asn                            lys
Asn (N)             gln; his; lys; arg                       gln
Asp (D)             glu                                      glu
Cys (C)             ser                                      ser
Gln (Q)             asn                                      asn
Glu (E)             asp                                      asp
Gly (G)             pro                                      pro
His (H)             asn; gln; lys; arg                       arg
Ile (I)             leu; val; met; ala; phe; norleucine      leu
Leu (L)             norleucine; ile; val; met; ala; phe      ile
Lys (K)             arg; gln; asn                            arg
Met (M)             leu; phe; ile                            leu
Phe (F)             leu; val; ile; ala                       leu
Pro (P)             gly                                      gly
Ser (S)             thr                                      thr
Thr (T)             ser                                      ser
Trp (W)             tyr                                      tyr
Tyr (Y)             trp; phe; thr; ser                       phe
Val (V)             ile; leu; met; phe; ala; norleucine      leu

[0095] Substantial modifications in function or immunological identity of HRG are accomplished by selecting
substitutions that differ significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the
area of the substitution, for example, as a sheet or helical conformation, (b) the charge or hydrophobicity of the molecule
at the target site, or (c) the bulk of the side chain. Naturally occurring residues are divided into groups based on common
side chain properties:

1) hydrophobic: norleucine, met, ala, val, leu, ile;

2) neutral hydrophilic: cys, ser, thr;

3) acidic: asp, glu;

4) basic: asn, gln, his, lys, arg;

5) residues that influence chain orientation: gly, pro; and

6) aromatic: trp, tyr, phe.

[0096] Non-conservative substitutions will entail exchanging a member of one of these classes for another. Such
substituted residues may be introduced into regions of HRG that are homologous with other receptor ligands, or, more
preferably, into the non-homologous regions of the molecule.
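
The class-based distinction can be expressed as a simple lookup. The following is a minimal sketch that transcribes the six groups exactly as listed above (three-letter lowercase names); the function names are hypothetical. A substitution is flagged as non-conservative when the original and replacement residues fall in different groups.

```python
# Side-chain groups transcribed from the six classes listed in the text.
SIDE_CHAIN_CLASSES = {
    "hydrophobic": {"norleucine", "met", "ala", "val", "leu", "ile"},
    "neutral hydrophilic": {"cys", "ser", "thr"},
    "acidic": {"asp", "glu"},
    "basic": {"asn", "gln", "his", "lys", "arg"},
    "chain orientation": {"gly", "pro"},
    "aromatic": {"trp", "tyr", "phe"},
}


def residue_class(residue):
    """Return the name of the group that contains `residue`."""
    name = residue.lower()
    for label, members in SIDE_CHAIN_CLASSES.items():
        if name in members:
            return label
    raise ValueError(f"unknown residue: {residue!r}")


def is_non_conservative(original, replacement):
    """True when the substitution exchanges a member of one class for another."""
    return residue_class(original) != residue_class(replacement)
```

For example, `is_non_conservative("asp", "lys")` reports a class change, whereas asp to glu stays within the acidic group.
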

[0097] In one embodiment of the invention, it is desirable to inactivate one or more protease cleavage sites that are
present in the molecule. These sites are identified by inspection of the encoded amino acid sequence. Where potential
protease cleavage sites are identified, e.g., at K241 R242, they are rendered inactive to proteolytic cleavage by
substituting the targeted residue with another residue, preferably a basic residue such as glutamine or a hydrophilic residue
such as serine; by deleting the residue; or by inserting a prolyl residue immediately after the residue.

[0098] In another embodiment, any methionyl residue other than the starting methionyl residue, or any residue
located within about three residues N- or C-terminal to each such methionyl residue, is substituted by another residue
(preferably in accord with Table 1) or deleted. We have found that oxidation of the 2 GFD M residues in the course
of E. coli expression appears to severely reduce GFD activity. Thus, these M residues are mutated in accord with Table
1. Alternatively, about 1-3 residues are inserted adjacent to such sites.
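
As an illustration of mutating methionyl residues in accord with Table 1, the sketch below replaces every methionine other than the first residue with the Table 1 preferred substitution (leu). The dictionary transcribes the "Preferred Substitutions" column in one-letter codes; the function name and test peptide are hypothetical.

```python
# Preferred substitutions transcribed from Table 1 (one-letter codes).
PREFERRED = {
    "A": "V", "R": "K", "N": "Q", "D": "E", "C": "S", "Q": "N", "E": "D",
    "G": "P", "H": "R", "I": "L", "L": "I", "K": "R", "M": "L", "F": "L",
    "P": "G", "S": "T", "T": "S", "W": "Y", "Y": "F", "V": "L",
}


def replace_internal_met(sequence):
    """Swap each methionine after position 1 for its Table 1 preferred residue.

    Returns the new sequence and the list of changes in 1-based notation.
    """
    residues = list(sequence)
    changes = []
    for index in range(1, len(residues)):  # index 0 is the starting methionyl residue
        if residues[index] == "M":
            residues[index] = PREFERRED["M"]
            changes.append(f"M{index + 1}{PREFERRED['M']}")
    return "".join(residues), changes
```

Calling `replace_internal_met("MKVMAM")` leaves the first methionine in place and rewrites the two internal ones.
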

[0099] Any cysteine residues not involved in maintaining the proper conformation of HRG also may be substituted,
generally with serine, to improve the oxidative stability of the molecule and prevent aberrant crosslinking.
[0100] Sites particularly suited for substitutions, deletions or insertions, or use as fragments, include, numbered from
the N-terminus of HRG-α of Figure 4:

1) potential glycosaminoglycan addition sites at the serine-glycine dipeptides at 42-43, 64-65, 151-152;

2) potential asparagine-linked glycosylation at positions 164, 170, 208 and 437, sites (NDS) 164-166, (NIT)
170-172, (NTS) 208-210, and NTS (609-611);

3) potential O-glycosylation in a cluster of serine and threonine at 209-218;

4) cysteines at 226, 234, 240, 254, 256 and 265;

5) transmembrane domain at 287-309;

6) loop 1 delineated by cysteines 226 and 240;

7) loop 2 delineated by cysteines 234 and 254;

8) loop 3 delineated by cysteines 256 and 265; and

9) potential protease processing sites at 2-3, 8-9, 23-24, 33-34, 36-37, 45-46, 48-49, 62-63, 66-67, 86-87, 110-111,
123-124, 134-135, 142-143, 272-273, 278-279 and 285-286.
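
Candidate sites of the kinds listed above can be located by pattern matching on the amino acid sequence. The sketch below is an assumption-laden illustration, not the method used to compile the list: it applies the general N-X-S/T sequon rule (X not proline) for asparagine-linked glycosylation, looks for Ser-Gly dipeptides, and treats adjacent K/R pairs as potential protease sites. Function names and the test peptide are hypothetical; positions are 1-based.

```python
import re


def n_glycosylation_sequons(sequence):
    """1-based positions of Asn in N-X-S/T motifs where X is not proline."""
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", sequence)]


def ser_gly_dipeptides(sequence):
    """1-based positions of the Ser in each Ser-Gly dipeptide."""
    return [m.start() + 1 for m in re.finditer(r"(?=SG)", sequence)]


def dibasic_pairs(sequence):
    """1-based positions of the first residue of each adjacent K/R pair."""
    return [m.start() + 1 for m in re.finditer(r"(?=[KR][KR])", sequence)]


# Hypothetical peptide: sequons at N2 and N11, Ser-Gly at 4, Lys-Arg at 6.
peptide = "ANDSGKRNPTNIT"
```

Lookahead patterns are used so that overlapping matches (for example three consecutive basic residues) are each reported.
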

[0101] Analogous regions in HRG-β1 may be determined by reference to Figure 9 which aligns analogous amino
acids in HRG-α and HRG-β1. The analogous HRG-β1 amino acids may be mutated or modified as discussed above
for HRG-α. Analogous regions in HRG-β2 may be determined by reference to Figure 15 which aligns analogous amino
acids in HRG-α, HRG-β1 and HRG-β2. The analogous HRG-β2 amino acids may be mutated or modified as discussed
above for HRG-α or HRG-β1. Analogous regions in HRG-β3 may be determined by reference to Figure 15 which aligns
analogous amino acids in HRG-α, HRG-β1 and HRG-β2. The analogous HRG-β3 amino acids may be mutated or
modified as discussed above for HRG-α, HRG-β1, or HRG-β2.

[0102] DNA encoding amino acid sequence variants of HRG is prepared by a variety of methods known in the art.
These methods include, but are not limited to, isolation from a natural source (in the case of naturally occurring amino
acid sequence variants) or preparation by oligonucleotide-mediated (or site-directed) mutagenesis, PCR mutagenesis,
and cassette mutagenesis of an earlier prepared variant or a non-variant version of HRG. These techniques may utilize
HRG nucleic acid (DNA or RNA), or nucleic acid complementary to HRG nucleic acid.

[0103] Oligonucleotide-mediated mutagenesis is a preferred method for preparing substitution, deletion, and
insertion variants of HRG DNA. This technique is well known in the art as described by Adelman et al., DNA, 2:183 (1983).
[0104] Generally, oligonucleotides of at least 25 nucleotides in length are used. An optimal oligonucleotide will have
12 to 15 nucleotides that are completely complementary to the template on either side of the nucleotide(s) coding for
the mutation. This ensures that the oligonucleotide will hybridize properly to the single-stranded DNA template
molecule. The oligonucleotides are readily synthesized using techniques known in the art such as that described by Crea
et al. (Proc. Natl. Acad. Sci. USA, 75:5765, 1978).
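
The length guidance above (12 to 15 perfectly matched nucleotides on each side of the changed codon, at least 25 nucleotides overall) can be sketched as follows. The function names, the sense-strand convention, and the test sequence are assumptions made for illustration. The oligonucleotide is returned in sense orientation; if the single-stranded template is itself the sense strand, the reverse complement is the strand that anneals to it.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def design_mutagenic_oligo(sense, codon_start, new_codon, flank=15):
    """Build a sense-orientation oligo replacing one codon.

    `codon_start` is the 0-based index of the first base of the codon in
    `sense`; `flank` is the number of unchanged bases kept on each side.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    if not 12 <= flank <= 15:
        raise ValueError("flank should be 12 to 15 nucleotides")
    left_start = codon_start - flank
    right_end = codon_start + 3 + flank
    if left_start < 0 or right_end > len(sense):
        raise ValueError("not enough flanking sequence around the codon")
    left = sense[left_start:codon_start]
    right = sense[codon_start + 3:right_end]
    return left + new_codon.upper() + right
```

With the minimum flank of 12 the oligonucleotide is 27 nucleotides long, which satisfies the 25-nucleotide lower bound.
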

[0105] Single-stranded DNA template may also be generated by denaturing double-stranded plasmid (or other) DNA
using standard techniques.

[0106] For alteration of the native DNA sequence (to generate amino acid sequence variants, for example), the
oligonucleotide is hybridized to the single-stranded template under suitable hybridization conditions. A DNA
polymerizing enzyme, usually the Klenow fragment of DNA polymerase I, is then added to synthesize the complementary
strand of the template using the oligonucleotide as a primer for synthesis. A heteroduplex molecule is thus formed
such that one strand of DNA encodes the mutated form of HRG, and the other strand (the original template) encodes
the native, unaltered sequence of HRG. This heteroduplex molecule is then transformed into a suitable host cell, usually
a prokaryote such as E. coli JM101. After the cells are grown, they are plated onto agarose plates and screened using
the oligonucleotide primer radiolabeled with 32P-phosphate to identify the bacterial colonies that contain the mutated
DNA. The mutated region is then removed and placed in an appropriate vector for protein production, generally an
expression vector of the type typically employed for transformation of an appropriate host.

[0107] The method described immediately above may be modified such that a homoduplex molecule is created
wherein both strands of the plasmid contain the mutation(s). The modifications are as follows: the single-stranded
oligonucleotide is annealed to the single-stranded template as described above. A mixture of three
deoxyribonucleotides, deoxyriboadenosine (dATP), deoxyriboguanosine (dGTP), and deoxyribothymidine (dTTP), is combined with
a modified thio-deoxyribocytosine called dCTP-(αS) (Amersham Corporation). This mixture is added to the
template-oligonucleotide complex. Upon addition of DNA polymerase to this mixture, a strand of DNA identical to the template
except for the mutated bases is generated. In addition, this new strand of DNA will contain dCTP-(αS) instead of dCTP,
which serves to protect it from restriction endonuclease digestion. After the template strand of the double-stranded
heteroduplex is nicked with an appropriate restriction enzyme, the template strand can be digested with ExoIII nuclease
or another appropriate nuclease past the region that contains the site(s) to be mutagenized. The reaction is then
stopped to leave a molecule that is only partially single-stranded. A complete double-stranded DNA homoduplex is
then formed using DNA polymerase in the presence of all four deoxyribonucleotide triphosphates, ATP, and DNA ligase.
This homoduplex molecule can then be transformed into a suitable host cell such as E. coli JM101, as described above.
[0108] Exemplary substitutions common to any HRG include S2T or D; E3D or K; R4K or E; K5R or E; E6D or K;
G7P or Y; R8K or D; G9P or Y; K10R or E; G11P or Y; K12R or E; G19P or Y; S20T or F; G21P or Y; K22R or E; K23R
or E; Q38D; S107N; G108P; N120K; D121K; S122T; N126S; I126L; T127S; A163V; N164K; T165-T174, any residue
to I, L, V, M, F, D, E, R or K; G175V or P; T176S or V; S177K or T; H178K or S; L179F or I; V180L or S; K181R or E;
A183N or V; E184K or D; K185R or E; E186D or Y; K187R or D; T188S or Q; F189Y or S; V191L or D; N192Q or H;
G193P or A; G194P or A; E195D or K; F197Y or I; M198V or Y; V199L or T; K200V or R; D201E or K; L202E or K;
S203A or T; N204A; N204Q; P205A; P205G; S206T or R; R207K or A; Y206P or F; L209I or D; K211I or D; F216Y or
I; T217H or S; G218A or P; A/D219K or R; R220K or A; A235/240/232V or F; E236/241/233D or K; E237/242/234D
or K; L238/243/235I or T; Y239/244/236F or T; Q240/245/237N or K; K241/246/238H or R; R242/247/238H or K;
V243/248/239L or T; L244/249/240I or S; T245/250/241S or I; I246/251/242V or T and T247/252/243S or I. Specifically
with respect to HRG-α, T222S, K or V; E223D, R or Q; N224Q, K or F; V225A, R or D; P226G, I, K or F; M227V, X R
or Y; K228R, H or D; V229L, K or D; Q230N, R or Y; N231Q, K or Y; Q232N, R or Y; E233D, K or T and K234R, H or
D (adjacent K/R mutations are paired in alternative embodiments to create new proteolysis sites). Specifically with
respect to HRG-β (any member), Q222N, R or Y; N223Q, K or Y; Y224F, T or R; V225A, K or D; M226V, T or R; A227V,
K, Y or D; S228T, Y or R; F229Y, I or K and Y230F, T or R are suitable variants. Specifically with respect to HRG-β1,
K231R or D; H232R or D; L233I, K, F or Y; G234P, R, A or S; I235L, K, F or Y; E236D, R or A; F237I, Y, K or A; M238V,
T, R or A and E239D, R or A are suitable variants. Specifically with respect to HRG-β^ and HRG-β2, K231R or D are
suitable variants. Alternatively, each of these residues may be deleted or the indicated substituents inserted adjacent
thereto. In addition, from about 1 to 10 variants are combined to produce combinations. These changes are made in the
proHRG, NTD, GFD, NTD-GFD or other fragments or fusions. Q213-G215, A219 and the about 11-21 residues
C-terminal to C221 differ among the various HRG classes. Residues at these are interchanged among HRG classes or
EGF family members, are deleted, or a residue inserted adjacent thereto.
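
The substitution shorthand used above (original residue, 1-based position, replacement residue) lends itself to a small parser. The following is an illustrative sketch only: the function names are hypothetical, and the 12-residue peptide is a stand-in loosely patterned on the N-terminal positions named in the list, not a sequence taken from Figure 4. The parser rejects a code whose stated original residue does not match the sequence, which guards against numbering errors.

```python
import re

# One original residue, a 1-based position, and one replacement residue.
_CODE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence, code):
    """Apply a single substitution such as "S2T" to `sequence`."""
    match = _CODE.match(code)
    if match is None:
        raise ValueError(f"unrecognized substitution code: {code!r}")
    original, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != original:
        raise ValueError(f"{code}: sequence has {sequence[position - 1]} at {position}")
    return sequence[:position - 1] + replacement + sequence[position:]


def apply_all(sequence, codes):
    """Apply several substitutions in turn to build a combination variant."""
    for code in codes:
        sequence = apply_substitution(sequence, code)
    return sequence
```

For example, `apply_all("MSERKEGRGKGK", ["S2T", "K5E"])` combines two of the listed kinds of change in one variant.
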

[0109] DNA encoding HRG-α mutants with more than one amino acid to be substituted may be generated in one of
several ways. If the amino acids are located close together in the polypeptide chain, they may be mutated
simultaneously using one oligonucleotide that codes for all of the desired amino acid substitutions. If, however, the amino acids
are located some distance from each other (separated by more than about ten amino acids), it is more difficult to
generate a single oligonucleotide that encodes all of the desired changes. Instead, one of two alternative methods may
be employed.

[0110] PCR mutagenesis is also suitable for making amino acid variants of HRG-α. While the following discussion
refers to DNA, it is understood that the technique also finds application with RNA. The PCR technique generally refers
to the following procedure (see Erlich, supra, the chapter by R. Higuchi, p. 61-70). When small amounts of template
DNA are used as starting material in a PCR, primers that differ slightly in sequence from the corresponding region in
a template DNA can be used to generate relatively large quantities of a specific DNA fragment that differs from the
template sequence only at the positions where the primers differ from the template. For introduction of a mutation into
a plasmid DNA, one of the primers is designed to overlap the position of the mutation and to contain the mutation; the
sequence of the other primer must be identical to a stretch of sequence of the opposite strand of the plasmid, but this
sequence can be located anywhere along the plasmid DNA. It is preferred, however, that the sequence of the second
primer is located within 200 nucleotides from that of the first, such that in the end the entire amplified region of DNA
bounded by the primers can be easily sequenced. PCR amplification using a primer pair like the one just described
results in a population of DNA fragments that differ at the position of the mutation specified by the primer, and possibly
at other positions, as template copying is somewhat error-prone.

[0111] If the ratio of template to product material is extremely low, the vast majority of product DNA fragments
incorporate the desired mutation(s). This product material is used to replace the corresponding region in the plasmid that
served as PCR template using standard DNA technology. Mutations at separate positions can be introduced
simultaneously by either using a mutant second primer, or performing a second PCR with different mutant primers and ligating
the two resulting PCR fragments simultaneously to the vector fragment in a three (or more)-part ligation.
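
The effect of the template-to-product ratio can be shown with back-of-envelope arithmetic. The sketch below is a deliberate idealization and not a model from the original text: it assumes the total number of molecules grows by a constant factor each cycle and treats every newly synthesized molecule as carrying the primer-encoded mutation, so it gives an upper bound on the mutant fraction. Function and parameter names are hypothetical.

```python
def product_after_cycles(template_molecules, cycles, efficiency=1.0):
    """Newly synthesized molecules after `cycles` rounds of amplification.

    Idealized: total molecules multiply by (1 + efficiency) every cycle,
    and the original template molecules are subtracted from the total.
    """
    total = template_molecules * (1.0 + efficiency) ** cycles
    return total - template_molecules


def mutant_fraction(template_molecules, product_molecules):
    """Fraction of all molecules that are product (assumed to be mutant)."""
    return product_molecules / (template_molecules + product_molecules)


# Ten ideal cycles starting from 1000 template molecules.
product = product_after_cycles(1000, 10)
fraction = mutant_fraction(1000, product)
```

Under these assumptions, ten cycles already leave the original template at roughly one part in a thousand of the final material, which is why a low starting template amount favors recovery of the mutant.
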
[0112] Another method for preparing variants, cassette mutagenesis, is based on the technique described by Wells
et al. (Gene, 34:315, 1985). The starting material is the plasmid (or other vector) comprising HRG DNA to be mutated.
The codon(s) in HRG DNA to be mutated are identified. There must be a unique restriction endonuclease site on each
side of the identified mutation site(s). If no such restriction sites exist, they may be generated using the above-described
oligonucleotide-mediated mutagenesis method to introduce them at appropriate locations in HRG DNA. After the
restriction sites have been introduced into the plasmid, the plasmid is cut at these sites to linearize it. A double-stranded
oligonucleotide encoding the sequence of the DNA between the restriction sites but containing the desired mutation(s)
is synthesized using standard procedures. The two strands are synthesized separately and then hybridized together
using standard techniques. This double-stranded oligonucleotide is referred to as the cassette. This cassette is
designed to have 3' and 5' ends that are compatible with the ends of the linearized plasmid, such that it can be directly
ligated to the plasmid. This plasmid now contains the mutated HRG DNA sequence.
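
The logic of cassette replacement can be sketched on plain sequence strings. This is a simplified illustration under stated assumptions: the plasmid is treated as a linear string on one strand, both recognition sites are kept intact, cohesive-end chemistry is ignored, and the function name and test sequences are hypothetical. The EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences used in the test are palindromic, so checking one strand is sufficient for them.

```python
def cassette_replace(plasmid, site_5, site_3, cassette_insert):
    """Replace the sequence between two unique restriction sites.

    Raises ValueError unless each site occurs exactly once and `site_5`
    lies upstream of `site_3`.
    """
    for site in (site_5, site_3):
        if plasmid.count(site) != 1:
            raise ValueError(f"site {site} must occur exactly once")
    start = plasmid.index(site_5) + len(site_5)  # first base after the 5' site
    end = plasmid.index(site_3)                  # first base of the 3' site
    if start > end:
        raise ValueError("site_5 must lie upstream of site_3")
    return plasmid[:start] + cassette_insert + plasmid[end:]


# Hypothetical construct: flank, EcoRI site, region to replace, BamHI site, flank.
construct = "AAA" + "GAATTC" + "CCCTTT" + "GGATCC" + "AAA"
mutated = cassette_replace(construct, "GAATTC", "GGATCC", "ACGT")
```

The uniqueness check mirrors the requirement that each flanking restriction site be unique, since a second occurrence elsewhere would cut the plasmid into additional fragments.
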

C. Insertion of DNA into a Cloning or Expression Vehicle 

[0113] The cDNA or genomic DNA encoding native or variant HRG is inserted into a replicable vector for further
cloning (amplification of the DNA) or for expression. Many vectors are available, and selection of the appropriate vector
will depend on 1) whether it is to be used for DNA amplification or for DNA expression, 2) the size of the DNA to be
inserted into the vector, and 3) the host cell to be transformed with the vector. Each vector contains various components
depending on its function (amplification of DNA or expression of DNA) and the host cell for which it is compatible. The
vector components generally include, but are not limited to, one or more of the following: a signal sequence, an origin
of replication, one or more marker genes, an enhancer element, a promoter, and a transcription termination sequence.

(i) Signal Sequence Component 

[0114] In general, the signal sequence may be a component of the vector, or it may be a part of HRG DNA that is
inserted into the vector. The native HRG DNA is believed to encode a signal sequence at the amino terminus (5' end
of the DNA encoding HRG) of the polypeptide that is cleaved during post-translational processing of the polypeptide
to form the mature HRG polypeptide ligand that binds to the p185HER2 receptor, although a conventional signal structure
is not apparent. Native proHRG is secreted from the cell but may remain lodged in the membrane because it contains
a transmembrane domain and a cytoplasmic region in the carboxyl terminal region of the polypeptide. Thus, in a
secreted, soluble version of HRG the carboxyl terminal domain of the molecule, including the transmembrane domain,
is ordinarily deleted. This truncated variant HRG polypeptide may be secreted from the cell, provided that the DNA
encoding the truncated variant encodes a signal sequence recognized by the host.

[0115] HRG of this invention may be expressed not only directly, but also as a fusion with a heterologous polypeptide,
preferably a signal sequence or other polypeptide having a specific cleavage site at the N- and/or C-terminus of the
mature protein or polypeptide. In general, the signal sequence may be a component of the vector, or it may be a part
of HRG DNA that is inserted into the vector. Included within the scope of this invention are HRG with the native signal
sequence deleted and replaced with a heterologous signal sequence. The heterologous signal sequence selected
should be one that is recognized and processed, i.e., cleaved by a signal peptidase, by the host cell. For prokaryotic
host cells that do not recognize and process the native HRG signal sequence, the signal sequence is substituted by
a prokaryotic signal sequence selected, for example, from the group of the alkaline phosphatase, penicillinase, lpp, or
heat-stable enterotoxin II leaders. For yeast secretion the native HRG signal sequence may be substituted by the yeast
invertase, alpha factor, or acid phosphatase leaders. In mammalian cell expression the native signal sequence is
satisfactory, although other mammalian signal sequences may be suitable.

(ii) Origin of Replication Component 

[0116] Both expression and cloning vectors generally contain a nucleic acid sequence that enables the vector to
replicate in one or more selected host cells. Generally, in cloning vectors this sequence is one that enables the vector
to replicate independently of the host chromosomal DNA, and includes origins of replication or autonomously replicating
sequences. Such sequences are well known for a variety of bacteria, yeast, and viruses. The origin of replication from
the plasmid pBR322 is suitable for most Gram-negative bacteria, the 2μ plasmid origin is suitable for yeast, and various
viral origins (SV40, polyoma, adenovirus, VSV or BPV) are useful for cloning vectors in mammalian cells. Generally
the origin of replication component is not needed for mammalian expression vectors (the SV40 origin may typically be
used only because it contains the early promoter).

[0117] Most expression vectors are "shuttle" vectors, i.e., they are capable of replication in at least one class of
organisms but can be transfected into another organism for expression. For example, a vector is cloned in E. coli and
then the same vector is transfected into yeast or mammalian cells for expression even though it is not capable of
replicating independently of the host cell chromosome.

[0118] DNA may also be amplified by insertion into the host genome. This is readily accomplished using Bacillus
species as hosts, for example, by including in the vector a DNA sequence that is complementary to a sequence found
in Bacillus genomic DNA. Transfection of Bacillus with this vector results in homologous recombination with the genome
and insertion of HRG DNA. However, the recovery of genomic DNA encoding HRG is more complex than that of an
exogenously replicated vector because restriction enzyme digestion is required to excise HRG DNA. DNA can be
amplified by PCR and directly transfected into the host cells without any replication component.

(iii) Selection Gene Component

[0119] Expression and cloning vectors should contain a selection gene, also termed a selectable marker. This gene
encodes a protein necessary for the survival or growth of transformed host cells grown in a selective culture medium.
Host cells not transformed with the vector containing the selection gene will not survive in the culture medium. Typical
selection genes encode proteins that (a) confer resistance to antibiotics or other toxins, e.g., ampicillin, neomycin,
methotrexate, or tetracycline, (b) complement auxotrophic deficiencies, or (c) supply critical nutrients not available
from complex media, e.g., the gene encoding D-alanine racemase for Bacilli.

[0120] One example of a selection scheme utilizes a drug to arrest growth of a host cell. Those cells that are
successfully transformed with a heterologous gene express a protein conferring drug resistance and thus survive the
selection regimen. Examples of such dominant selection use the drugs neomycin (Southern et al., J. Molec. Appl.
Genet. 1: 327, 1982), mycophenolic acid (Mulligan et al., Science 209: 1422, 1980) or hygromycin (Sugden et al., Mol.
Cell. Biol. 5: 410-413, 1985). The three examples given above employ bacterial genes under eukaryotic control to
convey resistance to the appropriate drug G418 or neomycin (geneticin), xgpt (mycophenolic acid), or hygromycin,
respectively.

[0121] Another example of suitable selectable markers for mammalian cells are those that enable the identification
of cells competent to take up HRG nucleic acid, such as dihydrofolate reductase (DHFR) or thymidine kinase. The
mammalian cell transformants are placed under selection pressure which only the transformants are uniquely adapted
to survive by virtue of having taken up the marker. Selection pressure is imposed by culturing the transformants under
conditions in which the concentration of selection agent in the medium is successively changed, thereby leading to
amplification of both the selection gene and the DNA that encodes HRG. Amplification is the process by which genes
in greater demand for the production of a protein critical for growth are reiterated in tandem within the chromosomes
of successive generations of recombinant cells. Increased quantities of HRG are synthesized from the amplified DNA.

[0122] For example, cells transformed with the DHFR selection gene are first identified by culturing all of the
transformants in a culture medium that contains methotrexate (Mtx), a competitive antagonist of DHFR. An appropriate host
cell when wild-type DHFR is employed is the Chinese hamster ovary (CHO) cell line deficient in DHFR activity, prepared
and propagated as described by Urlaub and Chasin, Proc. Natl. Acad. Sci. USA, 77: 4216, 1980. The transformed
cells are then exposed to increased levels of methotrexate. This leads to the synthesis of multiple copies of the DHFR
gene, and, concomitantly, multiple copies of other DNA comprising the expression vectors, such as the DNA encoding
HRG. This amplification technique can be used with any otherwise suitable host, e.g., ATCC No. CCL61 CHO-K1,
notwithstanding the presence of endogenous DHFR if, for example, a mutant DHFR gene that is highly resistant to
Mtx is employed (EP 117,060). Alternatively, host cells (particularly wild-type hosts that contain endogenous DHFR)
transformed or co-transformed with DNA sequences encoding HRG, wild-type DHFR protein, and another selectable
marker such as aminoglycoside 3' phosphotransferase (APH) can be selected by cell growth in medium containing a
selection agent for the selectable marker such as an aminoglycosidic antibiotic, e.g., kanamycin, neomycin, or G418
(see U.S. Pat. No. 4,965,199).

[0123] A suitable selection gene for use in yeast is the trp1 gene present in the yeast plasmid YRp7 (Stinchcomb et
al., Nature, 282: 39, 1979; Kingsman et al., Gene, 7: 141, 1979; or Tschemper et al., Gene, 10: 157, 1980). The trp1
gene provides a selection marker for a mutant strain of yeast lacking the ability to grow in tryptophan, for example,
ATCC No. 44076 or PEP4-1 (Jones, Genetics, 85: 12, 1977). The presence of the trp1 lesion in the yeast host cell
genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan.
Similarly, Leu2-deficient yeast strains (ATCC 20,622 or 38,626) are complemented by known plasmids bearing the
Leu2 gene.

25 

(iv) Promoter Component 

[0124] Expression and cloning vectors usually contain a promoter that is recognized by the host organism and is
operably linked to HRG nucleic acid. Promoters are untranslated sequences located upstream (5') to the start codon
of a structural gene (generally within about 100 to 1000 bp) that control the transcription and translation of a particular
nucleic acid sequence, such as HRG, to which they are operably linked. Such promoters typically fall into two classes,
inducible and constitutive. Inducible promoters are promoters that initiate increased levels of transcription from DNA
under their control in response to some change in culture conditions, e.g., the presence or absence of a nutrient or a
change in temperature. At this time a large number of promoters recognized by a variety of potential host cells are well
known. These promoters are operably linked to DNA encoding HRG by removing the promoter from the source DNA
by restriction enzyme digestion and inserting the isolated promoter sequence into the vector. Both the native HRG
promoter sequence and many heterologous promoters may be used to direct amplification and/or expression of HRG
DNA. However, heterologous promoters are preferred, as they generally permit greater transcription and higher yields
of expressed HRG as compared to the native HRG promoter.

[0125] Promoters suitable for use with prokaryotic hosts include the β-lactamase and lactose promoter systems
(Chang et al., Nature, 275: 615, 1978; and Goeddel et al., Nature, 281: 544, 1979), alkaline phosphatase, a tryptophan
(trp) promoter system (Goeddel, Nucleic Acids Res., 8: 4057, 1980 and EP 36,776) and hybrid promoters such as the
tac promoter (deBoer et al., Proc. Natl. Acad. Sci. USA, 80: 21-25, 1983). However, other known bacterial promoters
are suitable. Their nucleotide sequences have been published, thereby enabling a skilled worker operably to ligate
them to DNA encoding HRG (Siebenlist et al., Cell, 20: 269, 1980) using linkers or adaptors to supply any required
restriction sites. Promoters for use in bacterial systems also generally will contain a Shine-Dalgarno (S.D.) sequence
operably linked to the DNA encoding HRG.

[0126] Suitable promoting sequences for use with yeast hosts include the promoters for 3-phosphoglycerate kinase
(Hitzeman et al., J. Biol. Chem., 255: 2073, 1980) or other glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg., 7: 149,
1968; and Holland, Biochemistry, 17: 4900, 1978), such as enolase, glyceraldehyde-3-phosphate dehydrogenase,
hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate
mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase.
[0127] Other yeast promoters, which are inducible promoters having the additional advantage of transcription
controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid
phosphatase, degradative enzymes associated with nitrogen metabolism, metallothionein, glyceraldehyde-3-phosphate
dehydrogenase, and enzymes responsible for maltose and galactose utilization. Suitable vectors and promoters for use
in yeast expression are further described in Hitzeman et al., EP 73,657A. Yeast enhancers also are advantageously
used with yeast promoters.
[0128] Promoter sequences are known for eukaryotes. Virtually all eukaryotic genes have an AT-rich region located
approximately 25 to 30 bases upstream from the site where transcription is initiated. Another sequence found 70 to 80
bases upstream from the start of transcription of many genes is a CXCAAT (SEQ ID NO:1) region where X may be
any nucleotide. At the 3' end of most eukaryotic genes is an AATAAA sequence (SEQ ID NO:2) that may be the signal
for addition of the poly A tail to the 3' end of the coding sequence. All of these sequences are suitably inserted into
mammalian expression vectors.
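
The consensus elements described in the preceding paragraph can be located in a candidate sequence by simple pattern matching. The following Python sketch is illustrative only: the function names and the toy sequence are assumptions introduced here, not material from this specification, and a match to a short consensus does not by itself establish that the element is functional.

```python
import re

# Consensus motifs named in the text. The toy sequence used below is an
# illustrative assumption, not a sequence disclosed in this document.
CAAT_REGION = re.compile(r"C[ACGT]CAAT")   # CXCAAT, where X is any nucleotide
POLY_A_SIGNAL = re.compile(r"AATAAA")      # candidate polyadenylation signal


def find_motifs(seq):
    """Return 0-based start positions of each consensus motif in seq."""
    seq = seq.upper()
    return {
        "CXCAAT": [m.start() for m in CAAT_REGION.finditer(seq)],
        "AATAAA": [m.start() for m in POLY_A_SIGNAL.finditer(seq)],
    }


def at_fraction(seq):
    """Fraction of A and T bases, a crude indicator of an AT-rich region."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


toy = "GGCCCTCAATGGCGCGTATAAAAGGCGCCATGGCCTGAGGAATAAAGC"
hits = find_motifs(toy)
print(hits)
```

In practice the reported positions would be compared with the expected spacing relative to the transcription start site (about 70 to 80 bases upstream for the CXCAAT region, about 25 to 30 bases upstream for the AT-rich region).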

[0129] HRG gene transcription from vectors in mammalian host cells is controlled by promoters obtained from the
genomes of viruses such as polyoma virus, fowlpox virus (UK 2,211,504, published 5 July 1989), adenovirus (such as
Adenovirus 2), bovine papilloma virus, avian sarcoma virus, cytomegalovirus, a retrovirus, hepatitis-B virus and most
preferably Simian Virus 40 (SV40), from heterologous mammalian promoters, e.g., the actin promoter or an
immunoglobulin promoter, from heat-shock promoters, and from the promoter normally associated with HRG sequence,
provided such promoters are compatible with the host cell systems.

[0130] The early and late promoters of the SV40 virus are conveniently obtained as an SV40 restriction fragment
that also contains the SV40 viral origin of replication (Fiers et al., Nature, 273: 113 (1978); Mulligan and Berg, Science,
209: 1422-1427 (1980); Pavlakis et al., Proc. Natl. Acad. Sci. USA, 78: 7398-7402 (1981)). The immediate early
promoter of the human cytomegalovirus is conveniently obtained as a HindIII E restriction fragment (Greenaway et al.,
Gene, 18: 355-360 (1982)). A system for expressing DNA in mammalian hosts using the bovine papilloma virus as a
vector is disclosed in U.S. Pat. No. 4,419,446. A modification of this system is described in U.S. Pat. No. 4,601,978.
See also Gray et al., Nature, 295: 503-508 (1982) on expressing cDNA encoding immune interferon in monkey cells;
Reyes et al., Nature, 297: 598-601 (1982) on expression of human β-interferon cDNA in mouse cells under the control
of a thymidine kinase promoter from herpes simplex virus; Canaani and Berg, Proc. Natl. Acad. Sci. USA, 79: 5166-5170
(1982) on expression of the human interferon β1 gene in cultured mouse and rabbit cells; and Gorman et al., Proc.
Natl. Acad. Sci. USA, 79: 6777-6781 (1982) on expression of bacterial CAT sequences in CV-1 monkey kidney cells,
chicken embryo fibroblasts, Chinese hamster ovary cells, HeLa cells, and mouse NIH-3T3 cells using the Rous sarcoma
virus long terminal repeat as a promoter.

(v) Enhancer Element Component 

[0131] Transcription of a DNA encoding HRG of this invention by higher eukaryotes is often increased by inserting
an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually about from 10-300 bp, that
act on a promoter to increase its transcription. Enhancers are relatively orientation and position independent, having
been found 5' (Laimins et al., Proc. Natl. Acad. Sci. USA, 78: 993, 1981) and 3' (Lusky et al., Mol. Cell Bio., 3: 1108,
1983) to the transcription unit, within an intron (Banerji et al., Cell, 33: 729, 1983) as well as within the coding sequence
itself (Osborne et al., Mol. Cell Bio., 4: 1293, 1984). Many enhancer sequences are now known from mammalian genes
(globin, elastase, albumin, α-fetoprotein and insulin). Typically, however, one will use an enhancer from a eukaryotic
cell virus. Examples include the SV40 enhancer on the late side of the replication origin (bp 100-270), the
cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus
enhancers (see also Yaniv, Nature, 297: 17-18 (1982), on enhancing elements for activation of eukaryotic promoters).
The enhancer may be spliced into the vector at a position 5' or 3' to HRG DNA, but is preferably located at a site 5'
from the promoter.

(vi) Transcription Termination Component

[0132] Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human, or nucleated cells
from other multicellular organisms) will also contain sequences necessary for the termination of transcription and for
stabilizing the mRNA. Such sequences are commonly available from the 5' and, occasionally, 3' untranslated regions
of eukaryotic or viral DNAs or cDNAs. These regions contain nucleotide segments transcribed as polyadenylated
fragments in the untranslated portion of the mRNA encoding HRG. The 3' untranslated regions also include transcription
termination sites.

[0133] Construction of suitable vectors containing one or more of the above listed components and the desired coding
and control sequences employs standard ligation techniques. Isolated plasmids or DNA fragments are cleaved, tailored,
and religated in the form desired to generate the plasmids required.

[0134] For analysis to confirm correct sequences in plasmids constructed, the ligation mixtures are used to transform
E. coli K12 strain 294 (ATCC 31,445) and successful transformants selected by ampicillin or tetracycline resistance
where appropriate. Plasmids from the transformants are prepared, analyzed by restriction endonuclease digestion,
and/or sequenced by the method of Messing et al., Nucleic Acids Res., 9: 309 (1981) or by the method of Maxam et
al., Methods in Enzymology, 65: 499 (1980).

[0135] Particularly useful in the practice of this invention are expression vectors that provide for the transient
expression in mammalian cells of DNA encoding HRG. In general, transient expression involves the use of an expression
vector that is able to replicate efficiently in a host cell, such that the host cell accumulates many copies of the expression
vector and, in turn, synthesizes high levels of a desired polypeptide encoded by the expression vector. Transient
expression systems, comprising a suitable expression vector and a host cell, allow for the convenient positive identification
of polypeptides encoded by cloned DNAs, as well as for the rapid screening of such polypeptides for desired biological
or physiological properties. Thus, transient expression systems are particularly useful in the invention for purposes of
identifying analogs and variants of HRG that have HRG-like activity. Such a transient expression system is described
in EP 309,237 published 29 March 1989. Other methods, vectors, and host cells suitable for adaptation to the synthesis
of HRG in recombinant vertebrate cell culture are described in Gething et al., Nature, 293: 620-625, 1981; Mantei et
al., Nature, 281: 40-46, 1979; Levinson et al., EP 117,060 and EP 117,058. A particularly useful expression plasmid
for mammalian cell culture expression of HRG is pRK5 (EP pub. no. 307,247).

D. Selection and Transformation of Host Cells 

[0136] Suitable host cells for cloning or expressing the vectors herein are the prokaryote, yeast, or higher eukaryote
cells described above. Suitable prokaryotes include eubacteria, such as Gram-negative or Gram-positive organisms,
for example, E. coli, Bacilli such as B. subtilis, Pseudomonas species such as P. aeruginosa, Salmonella typhimurium,
or Serratia marcescens. One preferred E. coli cloning host is E. coli 294 (ATCC 31,446), although other strains such
as E. coli B, E. coli χ1776 (ATCC 31,537), and E. coli W3110 (ATCC 27,325) are suitable. These examples are illustrative
rather than limiting. Preferably the host cell should secrete minimal amounts of proteolytic enzymes. Alternatively, in
vitro methods of cloning, e.g., PCR or other nucleic acid polymerase reactions, are suitable.

[0137] In addition to prokaryotes, eukaryotic microbes such as filamentous fungi or yeast are suitable hosts for
HRG-encoding vectors. Saccharomyces cerevisiae, or common baker's yeast, is the most commonly used among lower
eukaryotic host microorganisms. However, a number of other genera, species, and strains are commonly available
and useful herein, such as Schizosaccharomyces pombe (Beach and Nurse, Nature, 290: 140 (1981); EP 139,383,
published May 2, 1985); Kluyveromyces hosts (U.S.S.N. 4,943,529) such as, e.g., K. lactis (Louvencourt et al., J.
Bacteriol., 737 (1983)), K. fragilis, K. bulgaricus, K. thermotolerans, and K. marxianus; Yarrowia (EP 402,226); Pichia
pastoris (EP 183,070; Sreekrishna et al., J. Basic Microbiol., 28: 265-278 (1988)); Candida; Trichoderma reesia (EP
244,234); Neurospora crassa (Case et al., Proc. Natl. Acad. Sci. USA, 76: 5259-5263 (1979)); and filamentous fungi
such as, e.g., Neurospora, Penicillium, Tolypocladium (WO 91/00357, published 10 January 1991), and Aspergillus
hosts such as A. nidulans (Ballance et al., Biochem. Biophys. Res. Commun., 112: 284-289 (1983); Tilburn et al., Gene,
28: 205-221 (1983); Yelton et al., Proc. Natl. Acad. Sci. USA, 81: 1470-1474 (1984)) and A. niger (Kelly and Hynes,
EMBO J., 4: 475-479 (1985)).

[0138] Suitable host cells for the expression of glycosylated HRG polypeptide are derived from multicellular
organisms. Such host cells are capable of complex processing and glycosylation activities. In principle, any higher
eukaryotic cell culture is workable, whether from vertebrate or invertebrate culture. Examples of invertebrate cells include
plant and insect cells. Numerous baculoviral strains and variants and corresponding permissive insect host cells from
hosts such as Spodoptera frugiperda (caterpillar), Aedes aegypti (mosquito), Aedes albopictus (mosquito), Drosophila
melanogaster (fruitfly), and Bombyx mori host cells have been identified (see, e.g., Luckow et al., Bio/Technology, 6: 47-55
(1988); Miller et al., in Genetic Engineering, Setlow, J.K. et al., eds., Vol. 8 (Plenum Publishing, 1986), pp. 277-279;
and Maeda et al., Nature, 315: 592-594 (1985)). A variety of such viral strains are publicly available, e.g., the L-1 variant
of Autographa californica NPV and the Bm-5 strain of Bombyx mori NPV, and such viruses may be used as the virus
herein according to the present invention, particularly for transfection of Spodoptera frugiperda cells. Plant cell cultures
of cotton, corn, potato, soybean, petunia, tomato, and tobacco can be utilized as hosts. Typically, plant cells are
transfected by incubation with certain strains of the bacterium Agrobacterium tumefaciens, which has been previously
manipulated to contain HRG DNA. During incubation of the plant cell culture with A. tumefaciens, the DNA encoding HRG
is transferred to the plant cell host such that it is transfected, and will, under appropriate conditions, express HRG DNA.
In addition, regulatory and signal sequences compatible with plant cells are available, such as the nopaline synthase
promoter and polyadenylation signal sequences (Depicker et al., J. Mol. Appl. Gen., 1: 561 [1982]). In addition, DNA
segments isolated from the upstream region of the T-DNA 780 gene are capable of activating or increasing transcription
levels of plant-expressible genes in recombinant DNA-containing plant tissue (see EP 321,196, published 21 June
1989).

[0139] However, interest has been greatest in vertebrate cells, and propagation of vertebrate cells in culture (tissue
culture) has become a routine procedure in recent years (Tissue Culture, Academic Press, Kruse and Patterson, editors
(1973)). Examples of useful mammalian host cell lines are monkey kidney CV1 line transformed by SV40 (COS-7,
ATCC CRL 1651); human embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture, Graham
et al., J. Gen. Virol., 36: 59, 1977); baby hamster kidney cells (BHK, ATCC CCL 10); Chinese hamster ovary cells/-DHFR
(CHO, Urlaub and Chasin, Proc. Natl. Acad. Sci. USA, 77: 4216 [1980]); mouse Sertoli cells (TM4, Mather, Biol.
Reprod., 23: 243-251 [1980]); monkey kidney cells (CV1 ATCC CCL 70); African green monkey kidney cells (VERO-76,
ATCC CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL
34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung cells (W138, ATCC CCL 75); human liver cells (Hep
G2, HB 8065); mouse mammary tumor (MMT 060562, ATCC CCL51); TRI cells (Mather et al., Annals N.Y. Acad. Sci.,
383: 44-68 [1982]); MRC 5 cells; FS4 cells; and a human hepatoma cell line (Hep G2). Preferred host cells are human
embryonic kidney 293 and Chinese hamster ovary cells.

[0140] Host cells are transfected and preferably transformed with the above-described expression or cloning vectors
of this invention and cultured in conventional nutrient media modified as appropriate for inducing promoters, selecting
transformants, or amplifying the genes encoding the desired sequences.
[0141] Transfection refers to the taking up of an expression vector by a host cell whether or not any coding sequences
are in fact expressed. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, CaPO4
and electroporation. Successful transfection is generally recognized when any indication of the operation of this vector
occurs within the host cell.

[0142] Transformation means introducing DNA into an organism so that the DNA is replicable, either as an
extrachromosomal element or by chromosomal integration. Depending on the host cell used, transformation is done using
standard techniques appropriate to such cells. The calcium treatment employing calcium chloride, as described in
section 1.82 of Sambrook et al., supra, is generally used for prokaryotes or other cells that contain substantial cell-wall
barriers. Infection with Agrobacterium tumefaciens is used for transformation of certain plant cells, as described by
Shaw et al., Gene, 23: 315 (1983) and WO 89/05859, published 29 June 1989. For mammalian cells without such cell
walls, the calcium phosphate precipitation method described in sections 16.30-16.37 of Sambrook et al., supra, is
preferred. General aspects of mammalian cell host system transformations have been described by Axel in U.S. Pat.
No. 4,399,216, issued 16 August 1983. Transformations into yeast are typically carried out according to the method
of Van Solingen et al., J. Bact., 130: 946 (1977) and Hsiao et al., Proc. Natl. Acad. Sci. (USA), 76: 3829 (1979). However,
other methods for introducing DNA into cells such as by nuclear injection, electroporation, or protoplast fusion may
also be used.

E. Culturing the Host Cells 

[0143] Prokaryotic cells used to produce HRG polypeptide of this invention are cultured in suitable media as described
generally in Sambrook et al., supra.

[0144] The mammalian host cells used to produce HRG of this invention may be cultured in a variety of media.
Commercially available media such as Ham's F10 (Sigma), Minimal Essential Medium ([MEM], Sigma), RPMI-1640
(Sigma), and Dulbecco's Modified Eagle's Medium ([DMEM], Sigma) are suitable for culturing the host cells. In addition,
any of the media described in Ham and Wallace, Meth. Enz., 58: 44 (1979), Barnes and Sato, Anal. Biochem., 102:
255 (1980), U.S. Pat. Nos. 4,767,704; 4,657,866; 4,927,762; or 4,560,655; WO 90/03430; WO 87/00195 and U.S. Pat.
Re. 30,985, may be used as culture media for the host cells. Any of these media may be supplemented as necessary
with hormones and/or other growth factors (such as insulin, transferrin, or epidermal growth factor), salts (such as
sodium chloride, calcium, magnesium, and phosphate), buffers (such as HEPES), nucleosides (such as adenosine
and thymidine), antibiotics (such as Gentamycin™ drug), trace elements (defined as inorganic compounds usually
present at final concentrations in the micromolar range), and glucose or an equivalent energy source. Any other
necessary supplements may also be included at appropriate concentrations that would be known to those skilled in the
art. The culture conditions, such as temperature, pH, and the like, are those previously used with the host cell selected
for expression, and will be apparent to the ordinarily skilled artisan.

[0145] The host cells referred to in this disclosure encompass cells in in vitro culture as well as cells that are within
a host animal.

[0146] It is further envisioned that HRG of this invention may be produced by homologous recombination, or with
recombinant production methods utilizing control elements introduced into cells already containing DNA encoding HRG
currently in use in the field. For example, a powerful promoter/enhancer element, a suppressor, or an exogenous
transcription modulatory element is inserted in the genome of the intended host cell in proximity and orientation sufficient
to influence the transcription of DNA encoding the desired HRG. The control element does not encode HRG of this
invention, but the DNA is present in the host cell genome. One next screens for cells making HRG of this invention, or
increased or decreased levels of expression, as desired.

F. Detecting Gene Amplification/Expression 


[0147] Gene amplification and/or expression may be measured in a sample directly, for example, by conventional
Southern blotting, Northern blotting to quantitate the transcription of mRNA (Thomas, Proc. Natl. Acad. Sci. USA, 77:
5201-5205 [1980]), dot blotting (DNA analysis), or in situ hybridization, using an appropriately labeled probe based on
the sequences provided herein. Various labels may be employed, most commonly radioisotopes, particularly 32P.
However, other techniques may also be employed, such as using biotin-modified nucleotides for introduction into a
polynucleotide. The biotin then serves as the site for binding to avidin or antibodies which may be labeled with a wide variety
of labels, such as radionuclides, fluorescers, enzymes, or the like. Alternatively, antibodies may be employed that can
recognize specific duplexes, including DNA duplexes, RNA duplexes, and DNA-RNA hybrid duplexes or DNA-protein
duplexes. The antibodies in turn may be labeled and the assay may be carried out where the duplex is bound to a
surface, so that upon the formation of duplex on the surface, the presence of antibody bound to the duplex can be
detected.

[0148] Gene expression, alternatively, may be measured by immunological methods, such as immunohistochemical
staining of tissue sections and assay of cell culture or body fluids, to quantitate directly the expression of gene product.
With immunohistochemical staining techniques, a cell sample is prepared, typically by dehydration and fixation,
followed by reaction with labeled antibodies specific for the gene product coupled, where the labels are usually visually
detectable, such as enzymatic labels, fluorescent labels, luminescent labels, and the like. A particularly sensitive staining
technique suitable for use in the present invention is described by Hsu et al., Am. J. Clin. Path., 75: 734-738 (1980).
[0149] Antibodies useful for immunohistochemical staining and/or assay of sample fluids may be either monoclonal
or polyclonal, and may be prepared in any mammal. Conveniently, the antibodies may be prepared against a native
HRG polypeptide or against a synthetic peptide based on the DNA sequences provided herein as described further in
Section 4 below.

G. Purification of The Heregulin Polypeptide 

[0150] HRG is recovered from a cellular membrane fraction. Alternatively, a proteolytically cleaved or a truncated
expressed soluble HRG fragment or subdomain is recovered from the culture medium as a soluble polypeptide.
[0151] When HRG is expressed in a recombinant cell other than one of human origin, HRG is completely free of
proteins or polypeptides of human origin. However, it is desirable to purify HRG from recombinant cell proteins or
polypeptides to obtain preparations that are substantially homogeneous as to HRG. As a first step, the culture medium
or lysate is centrifuged to remove particulate cell debris. The membrane and soluble protein fractions are then
separated. HRG is then purified from both the soluble protein fraction (requiring the presence of a protease) and from the
membrane fraction of the culture lysate, depending on whether HRG is membrane bound. The following procedures
are exemplary of suitable purification procedures: fractionation on immunoaffinity or ion-exchange columns; ethanol
precipitation; reversed phase HPLC; chromatography on silica, heparin sepharose or on a cation exchange resin such
as DEAE; chromatofocusing; SDS-PAGE; ammonium sulfate precipitation; and gel filtration using, for example,
Sephadex G-75.

[0152] HRG variants in which residues have been deleted, inserted or substituted are recovered in the same fashion
as the native HRG, taking account of any substantial changes in properties occasioned by the variation. For example,
preparation of a HRG fusion with another protein or polypeptide, e.g., a bacterial or viral antigen, facilitates purification;
an immunoaffinity column containing antibody to the antigen can be used to adsorb the fusion. Immunoaffinity columns
such as a rabbit polyclonal anti-HRG column can be employed to absorb HRG variant by binding it to at least one
remaining immune epitope. A protease inhibitor such as phenylmethylsulfonylfluoride (PMSF) also may be useful to
inhibit proteolytic degradation during purification, and antibiotics may be included to prevent the growth of adventitious
contaminants. One skilled in the art will appreciate that purification methods suitable for native HRG may require
modification to account for changes in the character of HRG variants upon expression in recombinant cell culture.

H. Covalent Modifications of HRG 

[0153] Covalent modifications of HRG polypeptides are included within the scope of this invention. Both native HRG 
and amino acid sequence variants of HRG optionally are covalently modified. One type of covalent modification included 
within the scope of this invention is a HRG polypeptide fragment. HRG fragments, such as HRG-GDF, having up to 
about 40 amino acid residues are conveniently prepared by chemical synthesis, or by enzymatic or chemical cleavage 
of the full-length HRG polypeptide or HRG variant polypeptide. Other types of covalent modifications of HRG or frag- 
ments thereof are introduced into the molecule by reacting targeted amino acid residues of HRG or fragments thereof 
with an organic derivatizing agent that is capable of reacting with selected side chains or the N- or C-terminal residues. 
[0154] Cysteinyl residues most commonly are reacted with α-haloacetates (and corresponding amines), such as
chloroacetic acid or chloroacetamide, to give carboxymethyl or carboxyamidomethyl derivatives. Cysteinyl residues
also are derivatized by reaction with bromotrifluoroacetone, α-bromo-β-(5-imidozoyl)propionic acid, chloroacetyl
phosphate, N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl disulfide, p-chloromercuribenzoate,
2-chloromercuri-4-nitrophenol, or chloro-7-nitrobenzo-2-oxa-1,3-diazole.

[0155] Histidyl residues are derivatized by reaction with diethylpyrocarbonate at pH 5.5-7.0 because this agent is
relatively specific for the histidyl side chain. Para-bromophenacyl bromide also is useful; the reaction is preferably
performed in 0.1 M sodium cacodylate at pH 6.0.

[0156] Lysinyl and amino terminal residues are reacted with succinic or other carboxylic acid anhydrides.
Derivatization with these agents has the effect of reversing the charge of the lysinyl residues. Other suitable reagents for
derivatizing α-amino-containing residues include imidoesters such as methyl picolinimidate; pyridoxal phosphate;
pyridoxal; chloroborohydride; trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and
transaminase-catalyzed reaction with glyoxylate.

[0157] Arginyl residues are modified by reaction with one or several conventional reagents, among them
phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and ninhydrin. Derivatization of arginine residues requires that the
reaction be performed in alkaline conditions because of the high pKa of the guanidine functional group. Furthermore,
these reagents may react with the groups of lysine as well as the arginine epsilon-amino group.
[0158] The specific modification of tyrosyl residues may be made, with particular interest in introducing spectral labels
into tyrosyl residues by reaction with aromatic diazonium compounds or tetranitromethane. Most commonly,
N-acetylimidizole and tetranitromethane are used to form O-acetyl tyrosyl species and 3-nitro derivatives, respectively. Tyrosyl
residues are iodinated using 125I or 131I to prepare labeled proteins for use in radioimmunoassay, the chloramine T
method described above being suitable.

[0159] Carboxyl side groups (aspartyl or glutamyl) are selectively modified by reaction with carbodiimides
(R-N=C=N-R'), where R and R' are different alkyl groups, such as 1-cyclohexyl-3-(2-morpholinyl-4-ethyl) carbodiimide or
1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore, aspartyl and glutamyl residues are converted to
asparaginyl and glutaminyl residues by reaction with ammonium ions.

[0160] Derivatization with bifunctional agents is useful for crosslinking HRG to a water-insoluble support matrix or
surface for use in the method for purifying anti-HRG antibodies, and vice versa. Commonly used crosslinking agents
include, e.g., 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, for example, esters
with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such as
3,3'-dithiobis(succinimidylpropionate), and bifunctional maleimides such as bis-N-maleimido-1,8-octane. Derivatizing agents such as
methyl-3-[(p-azidophenyl)-dithio]propioimidate yield photoactivatable intermediates that are capable of forming crosslinks
in the presence of light. Alternatively, reactive water-insoluble matrices such as cyanogen bromide-activated
carbohydrates and the reactive substrates described in U.S. Pat. Nos. 3,969,287; 3,691,016; 4,195,128; 4,247,642; 4,229,537;
and 4,330,440 are employed for protein immobilization.

[0161] Glutaminyl and asparaginyl residues are frequently deamidated to the corresponding glutamyl and aspartyl
residues, respectively. Alternatively, these residues are deamidated under mildly acidic conditions. Either form of these
residues falls within the scope of this invention.

[0162] Other modifications include hydroxylation of proline and lysine, phosphorylation of hydroxyl groups of seryl
or threonyl residues, methylation of the α-amino groups of lysine, arginine, and histidine side chains (T.E. Creighton,
Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, pp. 79-86 [1983]), acetylation of
the N-terminal amine, and amidation of any C-terminal carboxyl group.

[0163] HRG optionally is fused with a polypeptide heterologous to HRG. The heterologous polypeptide optionally is
an anchor sequence such as that found in the decay accelerating system (DAF); a toxin such as ricin, pseudomonas
exotoxin, gelonin, or other polypeptide that will result in target cell death. These heterologous polypeptides are
covalently coupled to HRG through side chains or through the terminal residues. Similarly, HRG is conjugated to other
molecules toxic or inhibitory to a target mammalian cell, e.g., tricothecenes, or antisense DNA that blocks
expression of target genes.

[0164] HRG also is covalently modified by altering its native glycosylation pattern. One or more carbohydrate
substituents are modified by adding, removing or varying the monosaccharide components at a given site, or by modifying
residues in HRG such that glycosylation sites are added or deleted.

[0165] Glycosylation of polypeptides is typically either N-linked or O-linked. N-linked refers to the attachment of the
carbohydrate moiety to the side chain of an asparagine residue. The tri-peptide sequences asparagine-X-serine and
asparagine-X-threonine, where X is any amino acid except proline, are the recognition sequences for enzymatic
attachment of the carbohydrate moiety to the asparagine side chain. Thus, the presence of either of these tri-peptide
sequences in a polypeptide creates a potential glycosylation site. O-linked glycosylation refers to the attachment of
one of the sugars N-acetylgalactosamine, galactose, or xylose, to a hydroxyamino acid, most commonly serine or
threonine, although 5-hydroxyproline or 5-hydroxylysine may also be used.
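
By way of illustration only, the N-linked recognition rule stated above (asparagine-X-serine or asparagine-X-threonine, where X is not proline) can be sketched as a sequence scan. The function name and the example peptide are hypothetical and do not correspond to any HRG sequence; a match indicates only a potential site, not that glycosylation occurs.

```python
import re

# Asn-X-Ser/Thr, X being any residue except proline (one-letter codes).
# A lookahead is used so that overlapping tri-peptides are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_n_glycosylation_sites(sequence):
    """Return 1-based positions of Asn residues in potential N-linked sites."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical peptide used purely as an example input.
example = "MNGTANPSQNNSTK"
print(find_n_glycosylation_sites(example))
```

The asparagine followed by proline in the example is skipped, consistent with the exclusion of proline at position X.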

[0166] Glycosylation sites are added to HRG by altering its amino acid sequence to contain one or more of the above-
described tri-peptide sequences (for N-linked glycosylation sites). The alteration may also be made by the addition of,
or substitution by, one or more serine or threonine residues to HRG (for O-linked glycosylation sites). For ease, HRG
is preferably altered through changes at the DNA level, particularly by mutating the DNA encoding HRG at preselected
bases such that codons are generated that will translate into the desired amino acids.
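
As a minimal sketch of a change at the DNA level, the following replaces one codon in a coding sequence so that the translated product gains an asparagine-X-threonine tri-peptide. The coding sequence, the helper names, and the chosen codon are assumptions made for illustration; they are not taken from any HRG-encoding DNA. The standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def substitute_codon(dna, codon_number, new_codon):
    """Replace the codon at 1-based position codon_number with new_codon."""
    start = 3 * (codon_number - 1)
    return dna[:start] + new_codon + dna[start + 3:]

# Hypothetical coding sequence: Met-Asn-Gly-Ala-Lys.
original = "ATGAACGGCGCCAAG"
# A single base change (GCC to ACC) turns Ala into Thr, creating Asn-Gly-Thr.
mutated = substitute_codon(original, 4, "ACC")
print(translate(original), "->", translate(mutated))
```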

[0167] Chemical or enzymatic coupling of glycosides to HRG increases the number of carbohydrate substituents.
These procedures are advantageous in that they do not require production of the polypeptide in a host cell that is
capable of N- and O-linked glycosylation. Depending on the coupling mode used, the sugar(s) may be attached to (a)
arginine and histidine, (b) free carboxyl groups, (c) free sulfhydryl groups such as those of cysteine, (d) free hydroxyl
groups such as those of serine, threonine, or hydroxyproline, (e) aromatic residues such as those of phenylalanine,
tyrosine, or tryptophan, or (f) the amide group of glutamine. These methods are described in WO 87/05330, published
11 September 1987, and in Aplin and Wriston (CRC Crit. Rev. Biochem., pp. 259-306 [1981]).
[0168] Carbohydrate moieties present on an HRG also are removed chemically or enzymatically. Chemical
deglycosylation requires exposure of the polypeptide to the compound trifluoromethanesulfonic acid, or an equivalent
compound. This treatment results in the cleavage of most or all sugars except the linking sugar (N-acetylglucosamine or
N-acetylgalactosamine), while leaving the polypeptide intact. Chemical deglycosylation is described by Hakimuddin et
al. (Arch. Biochem. Biophys., 259:52 [1987]) and by Edge et al. (Anal. Biochem., 118:131 [1981]). Carbohydrate
moieties are removed from HRG by a variety of endo- and exo-glycosidases as described by Thotakura et al. (Meth.
Enzymol., 138:350 [1987]).

[0169] Glycosylation added during expression in cells also is suppressed by tunicamycin as described by Duskin et
al. (J. Biol. Chem., 257:3105 [1982]). Tunicamycin blocks the formation of protein-N-glycoside linkages.
[0170] HRG also is modified by linking HRG to various nonproteinaceous polymers, e.g., polyethylene glycol, poly- 
propylene glycol or polyoxyalkylenes, in the manner set forth in U.S. Pat. Nos. 4,640,835; 4,496,689; 4,301,144; 
4,670,417; 4,791,192 or 4,179,337. 

[0171] One preferred way to increase the in vivo circulating half-life of non-membrane bound HRG is to conjugate it
to a polymer that confers extended half-life, such as polyethylene glycol (PEG) (Maxfield et al., Polymer 16:505-509
[1975]; Bailey, F. E., et al., in Nonionic Surfactants [Schick, M. J., ed.] pp. 794-821 [1967]; Abuchowski, A. et al., J. Biol.
Chem. 252:3582-3586 [1977]; Abuchowski, A. et al., Cancer Biochem. Biophys. 7:175-186 [1984]; Katre, N.V. et al.,
Proc. Natl. Acad. Sci., 84:1487-1491 [1987]; Goodson, R. et al., Bio/Technology 8:343-346 [1990]). Conjugation to
PEG also has been reported to reduce immunogenicity and toxicity (Abuchowski, A. et al., J. Biol. Chem., 252:
3578-3581 [1977]).

[0172] HRG also is entrapped in microcapsules prepared, for example, by coacervation techniques or by interfacial
polymerization (for example, hydroxymethylcellulose or gelatin-microcapsules and poly-[methylmethacrylate]
microcapsules, respectively), in colloidal drug delivery systems (for example, liposomes, albumin microspheres,
microemulsions, nanoparticles and nanocapsules), or in macroemulsions. Such techniques are disclosed in Remington's
Pharmaceutical Sciences, 16th edition, Osol, A., Ed., (1980).

[0173] HRG is also useful in generating antibodies, as standards in assays for HRG (e.g., by labeling HRG for use 
as a standard in a radioimmunoassay, enzyme-linked immunoassay, or radioreceptor assay), in affinity purification 
techniques, and in competitive-type receptor binding assays when labeled with radioiodine, enzymes, fluorophores, 
spin labels, and the like. 

[0174] Those skilled in the art will be capable of screening variants in order to select the optimal variant for the 
purpose intended. For example, a change in the immunological character of HRG, such as a change in affinity for a 
given antigen or for the HER2 receptor, is measured by a competitive-type immunoassay using a standard or control 
such as a native HRG (in particular native HRG-GFD). Other potential modifications of protein or polypeptide properties 
such as redox or thermal stability, hydrophobicity, susceptibility to proteolytic degradation, stability in recombinant cell
culture or in plasma, or the tendency to aggregate with carriers or into multimers are assayed by methods well known 
in the art. 

1. Therapeutic Use of Heregulin Ligands

[0175] While the role of the p185HER2 and its ligands is unknown in normal cell growth and differentiation, it is an
object of the present invention to develop therapeutic uses for the p185HER2 ligands of the present invention in
promoting normal growth and development and in inhibiting abnormal growth, specifically in malignant or neoplastic
tissues.

2. Therapeutic Compositions and Administration of HRG 

[0176] Therapeutic formulations of HRG or HRG antibody are prepared for storage by mixing the HRG protein having 
the desired degree of purity with optional physiologically acceptable carriers, excipients, or stabilizers (Remington's 
Pharmaceutical Sciences, supra), in the form of lyophilized cake or aqueous solutions. Acceptable carriers, excipients 
or stabilizers are nontoxic to recipients at the dosages and concentrations employed, and include buffers such as 
phosphate, citrate, and other organic acids; antioxidants including ascorbic acid; low molecular weight (less than about 
10 residues) polypeptides (to prevent methoxide formation); proteins, such as serum albumin, gelatin, or immunoglobulins;
hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, arginine
or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating
agents such as EDTA; sugar alcohols such as mannitol or sorbitol; salt-forming counterions such as sodium; and/or 
nonionic surfactants such as Tween, Pluronics or polyethylene glycol (PEG). 

[0177] HRG or HRG antibody to be used for in vivo administration must be sterile. This is readily accomplished by
filtration through sterile filtration membranes, prior to or following lyophilization and reconstitution. HRG or antibody to
an HRG ordinarily will be stored in lyophilized form or in solution.

[0178] Therapeutic HRG, or HRG specific antibody compositions generally are placed into a container having a 
sterile access port, for example, an intravenous solution bag or vial having a stopper pierceable by a hypodermic 
injection needle. 

[0179] HRG, its antibody or HRG variant when used as an antagonist may be optionally combined with or
administered in concert with other agents known for use in the treatment of malignancies. When HRG is used as an agonist to
stimulate the HER2 receptor, for example in tissue cultures, it may be combined with or administered in concert with
other compositions that stimulate growth such as PDGF, FGF, EGF, growth hormone or other protein growth factors.

[0180] The route of HRG or HRG antibody administration is in accord with known methods, e.g., injection or infusion
by intravenous, intraperitoneal, intracerebral, intramuscular, intraocular, intraarterial, or intralesional routes, or by
sustained release systems as noted below. HRG is administered continuously by infusion or by bolus injection. HRG
antibody is administered in the same fashion, or by administration into the blood stream or lymph.

[0181] Suitable examples of sustained-release preparations include semipermeable matrices of solid hydrophobic
polymers containing the protein, which matrices are in the form of shaped articles, e.g. films, or microcapsules.
Examples of sustained-release matrices include polyesters, hydrogels [e.g., poly(2-hydroxyethyl-methacrylate) as described
by Langer et al., J. Biomed. Mater. Res., 15:167-277 (1981) and Langer, Chem. Tech., 12:98-105 (1982) or
poly(vinylalcohol)], polylactides (U.S. Pat. No. 3,773,919, EP 58,481), copolymers of L-glutamic acid and gamma
ethyl-L-glutamate (Sidman et al., Biopolymers, 22:547-556 [1983]), non-degradable ethylene-vinyl acetate (Langer et al., supra),
degradable lactic acid-glycolic acid copolymers such as the Lupron Depot™ (injectable microspheres composed of
lactic acid-glycolic acid copolymer and leuprolide acetate), and poly-D-(-)-3-hydroxybutyric acid (EP 133,988). While
polymers such as ethylene-vinyl acetate and lactic acid-glycolic acid enable release of molecules for over 100 days,
certain hydrogels release proteins for shorter time periods. When encapsulated proteins remain in the body for a long
time, they may denature or aggregate as a result of exposure to moisture at 37°C, resulting in a loss of biological
activity and possible changes in immunogenicity. Rational strategies can be devised for protein stabilization depending
on the mechanism involved. For example, if the aggregation mechanism is discovered to be intermolecular S-S bond
formation through thio-disulfide interchange, stabilization may be achieved by modifying sulfhydryl residues,
lyophilizing from acidic solutions, controlling moisture content, using appropriate additives, and developing specific polymer
matrix compositions.

[0182] Sustained-release HRG or antibody compositions also include liposomally entrapped HRG or antibody.
Liposomes containing HRG or antibody are prepared by methods known per se: DE 3,218,121; Epstein et al., Proc. Natl.
Acad. Sci. USA, 82:3688-3692 (1985); Hwang et al., Proc. Natl. Acad. Sci. USA, 77:4030-4034 (1980); EP 52,322;
EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese patent application 83-118008; U.S. Pat. Nos. 4,485,045
and 4,544,545; and EP 102,324. Ordinarily the liposomes are of the small (about 200-800 Angstroms) unilamellar type
in which the lipid content is greater than about 30 mol. % cholesterol, the selected proportion being adjusted for the
optimal HRG therapy. Liposomes with enhanced circulation time are disclosed in U.S. Pat. No. 5,013,556.

[0183] Another use of the present invention comprises incorporating HRG polypeptide or antibody into formed
articles. Such articles can be used in modulating cellular growth and development. In addition, cell growth and division
and tumor invasion may be modulated with these articles.

[0184] An effective amount of HRG or antibody to be employed therapeutically will depend, for example, upon the
therapeutic objectives, the route of administration, and the condition of the patient. Accordingly, it will be necessary for
the therapist to titer the dosage and modify the route of administration as required to obtain the optimal therapeutic
effect. A typical daily dosage might range from about 1 µg/kg to up to 100 mg/kg or more, depending on the factors
mentioned above. Typically, the clinician will administer HRG or antibody until a dosage is reached that achieves the
desired effect. The progress of this therapy is easily monitored by conventional assays.
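
The weight-based range quoted above (about 1 µg/kg up to 100 mg/kg per day) can be converted to absolute amounts with simple arithmetic, sketched below. The function name and the 70 kg body weight are assumptions chosen for illustration; the sketch only restates the stated bounds and does not select a dose within them.

```python
UG_PER_MG = 1000

# Bounds of the typical daily range, expressed in micrograms per kilogram.
LOW_UG_PER_KG = 1                   # about 1 µg/kg
HIGH_UG_PER_KG = 100 * UG_PER_MG    # up to 100 mg/kg

def daily_dose_range_ug(body_weight_kg):
    """Return (low, high) daily amounts in micrograms for a body weight."""
    return LOW_UG_PER_KG * body_weight_kg, HIGH_UG_PER_KG * body_weight_kg

# Hypothetical 70 kg subject.
low_ug, high_ug = daily_dose_range_ug(70)
print(f"{low_ug} µg to {high_ug / UG_PER_MG:g} mg per day")
```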

3. Heregulin Antibody Preparation and Therapeutic Use 

[0185] The antibodies of this invention are obtained by routine screening. Polyclonal antibodies to HRG generally
are raised in animals by multiple subcutaneous (sc) or intraperitoneal (ip) injections of HRG and an adjuvant. It may
be useful to conjugate HRG or an HRG fragment containing the target amino acid sequence to a protein that is
immunogenic in the species to be immunized, e.g., keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, or
soybean trypsin inhibitor using a bifunctional or derivatizing agent, for example, maleimidobenzoyl sulfosuccinimide
ester (conjugation through cysteine residues), N-hydroxysuccinimide (through lysine residues), glutaraldehyde,
succinic anhydride, SOCl2, or R1N=C=NR, where R and R1 are different alkyl groups.

[0186] The route and schedule of immunizing an animal or removing and culturing antibody-producing cells are 
generally in keeping with established and conventional techniques for antibody stimulation and production. While mice 
are frequently immunized, it is contemplated that any mammalian subject including human subjects or antibody-pro- 
ducing cells obtained therefrom can be immunized to generate antibody producing cells. 

[0187] Subjects are typically immunized against HRG or its immunogenic conjugates or derivatives by combining 1
mg or 1 µg of HRG immunogen (for rabbits or mice, respectively) with 3 volumes of Freund's complete adjuvant and
injecting the solution intradermally at multiple sites. One month later the subjects are boosted with 1/5 to 1/10 the
original amount of immunogen in Freund's complete adjuvant (or other suitable adjuvant) by subcutaneous injection
at multiple sites. 7 to 14 days later animals are bled and the serum is assayed for anti-HRG antibody titer. Subjects
are boosted until the titer plateaus. Preferably, the subject is boosted with a conjugate of the same HRG, but conjugated
to a different protein and/or through a different cross-linking agent. Conjugates also can be made in recombinant cell
culture as protein fusions. Also, aggregating agents such as alum are used to enhance the immune response.
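
The boost amounts implied by the schedule above (1/5 to 1/10 of the original amount of immunogen) can be worked out as follows. This is a sketch under the stated priming amounts only; the dictionary and function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Priming amounts from the text, in micrograms: 1 mg for rabbits, 1 µg for mice.
PRIMING_UG = {"rabbit": Fraction(1000), "mouse": Fraction(1)}

def boost_range_ug(species):
    """Return (low, high) boost amounts: 1/10 to 1/5 of the priming amount."""
    priming = PRIMING_UG[species]
    return priming / 10, priming / 5

for species in PRIMING_UG:
    low, high = boost_range_ug(species)
    print(f"{species}: boost with {float(low):g} to {float(high):g} µg")
```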

[0188] After immunization, monoclonal antibodies are prepared by recovering immune lymphoid cells (typically spleen
cells or lymphocytes from lymph node tissue) from immunized animals and immortalizing the cells in conventional
fashion, e.g., by fusion with myeloma cells or by Epstein-Barr (EB) virus transformation and screening for clones expressing
the desired antibody. The hybridoma technique described originally by Kohler and Milstein, Eur. J. Immunol. 6:511
(1976) has been widely applied to produce hybrid cell lines that secrete high levels of monoclonal antibodies against
many specific antigens.

[0189] It is possible to fuse cells of one species with another. However, it is preferable that the source of the immunized
antibody producing cells and the myeloma be from the same species. 

[0190] Hybridoma cell lines producing anti-HRG are identified by screening the culture supernatants for antibody
which binds to HRG. This is routinely accomplished by conventional immunoassays using soluble HRG preparations 
or by FACS using cell-bound HRG and labelled candidate antibody. 

[0191] The hybrid cell lines can be maintained in culture in vitro in cell culture media. The cell lines of this invention 
can be selected and/or maintained in a composition comprising the continuous cell line in hypoxanthine-aminopterin 
thymidine (HAT) medium. In fact, once the hybridoma cell line is established, it can be maintained on a variety of 
nutritionally adequate media. Moreover, the hybrid cell lines can be stored and preserved in any number of conventional 
ways, including freezing and storage under liquid nitrogen. Frozen cell lines can be revived and cultured indefinitely 
with resumed synthesis and secretion of monoclonal antibody. The secreted antibody is recovered from tissue culture 
supernatant by conventional methods such as precipitation, ion exchange chromatography, affinity chromatography, 
or the like. The antibodies described herein are also recovered from hybridoma cell cultures by conventional methods 
for purification of IgG or IgM as the case may be that heretofore have been used to purify these immunoglobulins from 
pooled plasma, e.g., ethanol or polyethylene glycol precipitation procedures. The purified antibodies are sterile filtered, 
and optionally are conjugated to a detectable marker such as an enzyme or spin label for use in diagnostic assays of 
HRG in test samples. 

[0192] While mouse monoclonal antibodies routinely are used, the invention is not so limited; in fact, human
antibodies may be used and may prove to be preferable. Such antibodies can be obtained by using human hybridomas
(Cote et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985)). Chimeric antibodies (Cabilly et
al.; Morrison et al., Proc. Natl. Acad. Sci., 81:6851 (1984); Neuberger et al., Nature 312:604 (1984); Takeda et al.,
Nature 314:452 (1985)) containing a murine anti-HRG variable region and a human constant region of appropriate
biological activity (such as ability to activate human complement and mediate ADCC) are within the scope of this
invention, as are humanized anti-HRG antibodies produced by conventional CDR-grafting methods.

[0193] Techniques for creating recombinant DNA versions of the antigen-binding regions of antibody molecules
(known as Fab or variable region fragments) which bypass the generation of monoclonal antibodies are encompassed
within the practice of this invention. One extracts antibody-specific messenger RNA molecules from immune system
cells taken from an immunized subject, transcribes these into complementary DNA (cDNA), clones the cDNA into
a bacterial expression system, and selects for the desired binding characteristic. The Scripps/Stratagene method uses
a bacteriophage lambda vector system containing a leader sequence that causes the expressed Fab protein to migrate
to the periplasmic space (between the bacterial cell membrane and the cell wall) or to be secreted. One can rapidly
generate and screen great numbers of functional Fab fragments to identify those which bind HRG with the desired
characteristics.

[0194] Antibodies specific to HRG-α, HRG-β1, HRG-β2 and HRG-β3 may be produced and used in the manner
described above. HRG-α, HRG-β1, HRG-β2 and HRG-β3 specific antibodies of this invention preferably do not
cross-react with other members of the EGF family (Fig. 6) or with each other.

[0195] Antibodies capable of specifically binding to the HRG-NTD, HRG-GFD or HRG-CTP are of particular interest.
Also of interest are antibodies capable of specifically binding to the proteolytic processing sites between the GFD and
transmembrane domains. These antibodies are identified by methods that are conventional per se. For example, a
bank of candidate antibodies capable of binding to HRG-ECD or proHRG are obtained by the above methods using
immunization with full proHRG. These can then be subdivided by their ability to bind to the various HRG domains using
conventional mapping techniques. Less preferably, antibodies specific for a predetermined domain are initially raised
by immunizing the subject with a polypeptide comprising substantially only the domain in question, e.g. HRG-GFD free
of NTD or CTP polypeptides. These antibodies will not require mapping unless binding to a particular epitope is desired.

[0196] Antibodies that are capable of binding to proteolytic processing sites are of particular interest. They are
produced either by immunizing with an HRG fragment that includes the CTP processing site, with intact HRG, or with
HRG-NTD-GFD and then screening for the ability to block or inhibit proteolytic processing of HRG into the NTD-GFD
fragment by recombinant host cells or isolated cell lines that are otherwise capable of processing HRG to the fragment.
These antibodies are useful for suppressing the release of NTD-GFD and therefore are promising for use in preventing
stimulation of the HER-2 receptor. They also are useful in controlling cell growth and
replication. Anti-GFD antibodies are useful for the same reasons, but may not be as efficient biologically as antibodies
directed against a processing site.

[0197] Antibodies are selected that are capable of binding only to one of the members of the HRG family, e.g.
HRG-alpha or any one of the HRG-beta isoforms. Since each of the HRG family members has a distinct GFD-transmembrane
domain cleavage site, antibodies directed specifically against these unique sequences will enable the highly specific
inhibition of each of the GFDs or processing sites, and thereby refine the desired biological response. For example,
breast carcinoma cells which are HER-2 dependent may in fact be activated only by a single GFD isotype or, if not,
the activating GFD may originate only from a particular processing sequence, either on the HER-2 bearing cell itself
or on a GFD-generating cell. The identification of the target activating GFD or processing site is a straightforward
matter of analyzing HER-2 dependent carcinomas, e.g., by analyzing the tissues for the presence of a particular GFD
family member associated with the receptor, or by analyzing the tissues for expression of an HRG family member
(which then would serve as the therapeutic target). These selective antibodies are produced in the same fashion as
described above, either by immunization with the target sequence or domain, or by selecting from a bank of antibodies
having broader specificity.

[0198] As described above, the antibodies should have high specificity and affinity for the target sequence. For 
example, the antibodies directed against GFD sequences should have greater affinity for the GFD than GFD has for 
the HER-2 receptor. Such antibodies are selected by routine screening methods. 

4. Non-Therapeutic Uses of Heregulin and its Antibodies

[0199] The nucleic acid encoding HRG may be used as a diagnostic for tissue specific typing. For example, such
procedures as in situ hybridization, Northern and Southern blotting, and PCR analysis may be used to determine
whether DNA and/or RNA encoding HRG are present in the cell type(s) being evaluated. In particular, the nucleic acid
may be useful as a specific probe for certain types of tumor cells such as, for example, mammary gland, gastric and
colon adenocarcinomas, salivary gland and other tissues containing the p185HER2.

[0200] Isolated HRG may be used in quantitative diagnostic assays as a standard or control against which samples 
containing unknown quantities of HRG may be compared. 

[0201] Isolated HRG may be used as a growth factor for in vitro cell culture, and in vivo to promote the growth of cells
containing p185HER2 or other analogous receptors.

[0202] HRG antibodies are useful in diagnostic assays for HRG expression in specific cells or tissues. The antibodies
are labeled in the same fashion as HRG described above and/or are immobilized on an insoluble matrix.

[0203] HRG antibodies also are useful for the affinity purification of HRG from recombinant cell culture or natural
sources. HRG antibodies that do not detectably cross-react with other HRG can be used to purify HRG free from other
known ligands or contaminating protein.

[0204] Suitable diagnostic assays for HRG and its antibodies are well known per se. Such assays include competitive
and sandwich assays, and steric inhibition assays. Competitive and sandwich methods employ a phase-separation
step as an integral part of the method while steric inhibition assays are conducted in a single reaction mixture.
Fundamentally, the same procedures are used for the assay of HRG and for substances that bind HRG, although certain
methods will be favored depending upon the molecular weight of the substance being assayed. Therefore, the
substance to be tested is referred to herein as an analyte, irrespective of its status otherwise as an antigen or antibody,
and proteins that bind to the analyte are denominated binding partners, whether they be antibodies, cell surface
receptors, or antigens.

[0205] Analytical methods for HRG or its antibodies all use one or more of the following reagents: labeled analyte
analogue, immobilized analyte analogue, labeled binding partner, immobilized binding partner and steric conjugates.
The labeled reagents also are known as "tracers."

[0206] The label used (and this is also useful to label HRG encoding nucleic acid for use as a probe) is any detectable
functionality that does not interfere with the binding of analyte and its binding partner. Numerous labels are known for
use in immunoassay, examples including moieties that may be detected directly, such as fluorochrome,
chemiluminescent, and radioactive labels, as well as moieties, such as enzymes, that must be reacted or derivatized to be detected.
Examples of such labels include the radioisotopes 32P, 14C, 125I, 3H, and 131I, fluorophores such as rare earth chelates
or fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, luciferases, e.g., firefly luciferase
and bacterial luciferase (U.S. Pat. No. 4,737,456), luciferin, 2,3-dihydrophthalazinediones, horseradish peroxidase
(HRP), alkaline phosphatase, β-galactosidase, glucoamylase, lysozyme, saccharide oxidases, e.g., glucose oxidase,
galactose oxidase, and glucose-6-phosphate dehydrogenase, heterocyclic oxidases such as uricase and xanthine
oxidase, coupled with an enzyme that employs hydrogen peroxide to oxidize a dye precursor such as HRP,
lactoperoxidase, or microperoxidase, biotin/avidin, spin labels, bacteriophage labels, stable free radicals, and the like.

[0207] Conventional methods are available to bind these labels covalently to proteins or polypeptides. For instance,
coupling agents such as dialdehydes, carbodiimides, dimaleimides, bis-imidates, bis-diazotized benzidine, and the like
may be used to tag the antibodies with the above-described fluorescent, chemiluminescent, and enzyme labels. See,
for example, U.S. Pat. Nos. 3,940,475 (fluorimetry) and 3,645,090 (enzymes); Hunter et al., Nature, 144:945 (1962);
David et al., Biochemistry, 13:1014-1021 (1974); Pain et al., J. Immunol. Methods, 40:219-230 (1981); and Nygren, J.
Histochem. and Cytochem., 30:407-412 (1982). Preferred labels herein are enzymes such as horseradish peroxidase
and alkaline phosphatase. The conjugation of such label, including the enzymes, to the antibody is a standard
manipulative procedure for one of ordinary skill in immunoassay techniques. See, for example, O'Sullivan et al., "Methods
for the Preparation of Enzyme-antibody Conjugates for Use in Enzyme Immunoassay," in Methods in Enzymology, ed.
J.J. Langone and H. Van Vunakis, Vol. 73 (Academic Press, New York, New York, 1981), pp. 147-166. Such bonding
methods are suitable for use with HRG or its antibodies, all of which are proteinaceous.

[0208] Immobilization of reagents is required for certain assay methods. Immobilization entails separating the binding
partner from any analyte that remains free in solution. This conventionally is accomplished by either insolubilizing the
binding partner or analyte analogue before the assay procedure, as by adsorption to a water-insoluble matrix or surface
(Bennich et al., U.S. Pat. No. 3,720,760), by covalent coupling (for example, using glutaraldehyde cross-linking), or by
insolubilizing the partner or analogue afterward, e.g., by immunoprecipitation.

[0209] Other assay methods, known as competitive or sandwich assays, are well established and widely used in the 
commercial diagnostics industry. 

[0210] Competitive assays rely on the ability of a tracer analogue to compete with the test sample analyte for a limited
number of binding sites on a common binding partner. The binding partner generally is insolubilized before or after the
competition and then the tracer and analyte bound to the binding partner are separated from the unbound tracer and
analyte. This separation is accomplished by decanting (where the binding partner was preinsolubilized) or by
centrifuging (where the binding partner was precipitated after the competitive reaction). The amount of test sample analyte
is inversely proportional to the amount of bound tracer as measured by the amount of marker substance. Dose-response
curves with known amounts of analyte are prepared and compared with the test results to quantitatively determine the
amount of analyte present in the test sample. These assays are called ELISA systems when enzymes are used as the
detectable markers.
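
One way to compare a test result with a dose-response curve is sketched below, assuming piecewise-linear interpolation on the logarithm of the analyte amount between bracketing standards. The interpolation scheme, the function name, and the standard values are assumptions for illustration; the text does not prescribe a particular curve-fitting method, and the numbers are not measured data.

```python
import math

def interpolate_analyte(standards, signal):
    """Estimate the analyte amount from a bound-tracer signal.

    standards: (amount, bound_signal) pairs in which the signal falls
    strictly as the amount rises, as expected for a competitive assay.
    Interpolates linearly in log10(amount) between bracketing standards.
    """
    pts = sorted(standards)  # ascending amount, hence descending signal
    for (a_lo, s_hi), (a_hi, s_lo) in zip(pts, pts[1:]):
        if s_lo <= signal <= s_hi:
            frac = (s_hi - signal) / (s_hi - s_lo)
            log_lo, log_hi = math.log10(a_lo), math.log10(a_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("signal outside the range of the standard curve")

# Hypothetical standards: analyte amount versus bound tracer (arbitrary units).
standards = [(1, 9000), (10, 6000), (100, 3000), (1000, 1000)]
print(round(interpolate_analyte(standards, 4500), 1))
```

A higher signal maps to a lower estimated amount, reflecting the inverse relationship between test sample analyte and bound tracer.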

[0211] Another species of competitive assay, called a "homogeneous" assay, does not require a phase separation.
Here, a conjugate of an enzyme with the analyte is prepared and used such that when anti-analyte binds to the analyte
the presence of the anti-analyte modifies the enzyme activity. In this case, HRG or its immunologically active fragments
are conjugated with a bifunctional organic bridge to an enzyme such as peroxidase. Conjugates are selected for use
with anti-HRG so that binding of the anti-HRG antibody inhibits or potentiates the enzyme activity of the label. This
method per se is widely practiced under the name of EMIT.

[0212] Steric conjugates are used in steric hindrance methods for homogeneous assay. These conjugates are
synthesized by covalently linking a low-molecular-weight hapten to a small analyte so that antibody to hapten substantially
is unable to bind the conjugate at the same time as anti-analyte. Under this assay procedure the analyte present in
the test sample will bind anti-analyte, thereby allowing anti-hapten to bind the conjugate, resulting in a change in the
character of the conjugate hapten, e.g., a change in fluorescence when the hapten is a fluorophore.

[0213] Sandwich assays particularly are useful for the determination of HRG or HRG antibodies. In sequential
sandwich assays an immobilized binding partner is used to adsorb test sample analyte, the test sample is removed as by
washing, the bound analyte is used to adsorb labeled binding partner, and bound material is then separated from
residual tracer. The amount of bound tracer is directly proportional to test sample analyte. In "simultaneous" sandwich
assays the test sample is not separated before adding the labeled binding partner. A sequential sandwich assay using
an anti-HRG monoclonal antibody as one antibody and a polyclonal anti-HRG antibody as the other is useful in testing
samples for HRG activity.
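
Because bound tracer in a sandwich assay is directly proportional to test sample analyte, a calibration can be sketched as a least-squares line through the origin. The fitting choice, the function names, and the standard values are assumptions for illustration only.

```python
def fit_through_origin(standards):
    """Least-squares slope for signal = slope * amount (no intercept)."""
    sxy = sum(amount * signal for amount, signal in standards)
    sxx = sum(amount * amount for amount, _ in standards)
    return sxy / sxx

def estimate_analyte(slope, signal):
    """Convert a bound-tracer signal into an analyte amount."""
    return signal / slope

# Hypothetical standards: analyte amount versus bound tracer (arbitrary units).
standards = [(1, 200), (2, 400), (4, 800)]
slope = fit_through_origin(standards)
print(estimate_analyte(slope, 600))
```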

[0214] The foregoing are merely exemplary diagnostic assays for HRG and antibodies. Other methods now or
hereafter developed for the determination of these analytes are included within the scope hereof, including the bioassays
described above.

[0215] HRG polypeptides may be used for affinity purification of receptors such as the p185HER2 and other similar
receptors that have a binding affinity for HRG, and more specifically HRG-α, HRG-β1, HRG-β2 and HRG-β3. HRG-α,
HRG-β1, HRG-β2 and HRG-β3 may be used to form fusion polypeptides wherein the HRG portion is useful for affinity
binding to nucleic acids and to heparin.

[0216] HRG polypeptides may be used as ligands for competitive screening of potential agonists or antagonists for
binding to p185HER2. HRG variants are useful as standards or controls in assays for HRG provided that they are
recognized by the analytical system employed, e.g. an anti-HRG antibody. Antibody capable of binding to denatured
HRG or a fragment thereof is employed in assays in which HRG is denatured prior to assay, and in this assay the
denatured HRG or fragment is used as a standard or control. Preferably, HRG-α, HRG-β1, HRG-β2 and HRG-β3 are
detectably labelled and a competition assay for bound p185HER2 is conducted using standard assay procedures.
[0217] The methods and procedures described herein with HRG-α may be applied similarly to HRG-β1, HRG-β2
and HRG-β3 and to other novel HRG ligands and to their variants. The following examples are offered by way of
illustration and not by way of limitation.

EXAMPLES 

[0218] Example 1 

Preparation of Breast Cancer Cell Supernatants 

[0219] Heregulin-α was isolated from the supernatant of the human breast carcinoma MDA-MB-231. HRG was
released into and isolated from the cell culture medium.

a. Cell Culture 

[0220] MDA-MB-231, human breast carcinoma cells, obtainable from the American Type Culture Collection (ATCC
HTB 26), were initially scaled-up from 25 cm² tissue culture flasks to 890 cm² plastic roller bottles (Corning, Corning,
NY) by serial passaging and the seed train was maintained at the roller bottle scale. To passage the cells and maintain
the seed train, flasks and roller bottles were first rinsed with phosphate buffered saline (PBS) and then incubated with
trypsin/EDTA (Sigma, St. Louis, MO) for 1-3 minutes at 37°C. The detached cells were then pipetted several times in
fresh culture medium containing fetal bovine serum (FBS) (Gibco, Grand Island, NY) to break up cell clumps and to
inactivate the trypsin. The cells were finally split at a ratio of 1:10 into fresh medium, transferred into new flasks or
bottles, incubated at 37°C, and allowed to grow until nearly confluent. The growth medium in which the cells were
maintained was a combined DME/Ham's F-12 medium formulation modified with respect to the concentrations of some
amino acids, vitamins, sugars, and salts, and supplemented with 5% FBS. The same basal medium is used for the
serum-free ligand production and is supplemented with 0.5% Primatone RL (Sheffield, Norwich, NY).

b. Large Scale Production 

[0221] Large scale MDA-MB-231 cell growth was obtained by using Percell Biolytica microcarriers (Hyclone
Laboratories, Logan, UT) made of weighted cross-linked gelatin. The microcarriers were first hydrated, autoclaved, and
rinsed according to the manufacturer's recommendations. Cells from 10 roller bottles were trypsinized and added into
an inoculation spinner vessel which contained three liters of growth medium and 10-20 g of hydrated microcarriers.
The cells were stirred gently for about one hour and transferred into a ten-liter instrumented fermenter containing seven
liters of growth medium. The culture was agitated at 65-75 rpm to maintain the microcarriers in suspension. The
fermenter was controlled at 37°C and the pH was maintained at 7.0-7.2 by the addition of sodium carbonate and CO2.
Air and oxygen gases were sparged to maintain the culture at about 40% of air saturation. The cell population was
monitored microscopically with a fluorescent vital stain (fluorescein diacetate) and compared to trypan blue staining
to assess the relative cell viability and the degree of microcarrier invasion by the cells. Changes in cell-microcarrier
aggregate size were monitored by microscopic photography.

[0222] Once the microcarriers appeared 90-100% confluent, the culture was washed with serum-free medium to
remove the serum. This was accomplished by stopping the agitation and other controls to allow the carriers to settle
to the bottom of the vessel. Approximately nine liters of the culture supernatant were pumped out of the vessel and
replaced with an equal volume of serum-free medium (the same basal medium described above, with or without
Primatone RL supplementation). The microcarriers were briefly resuspended and the process was repeated until
a 1000-fold removal of FBS was achieved. The cells were then incubated in the serum-free medium for 3-5 days. The
glucose concentration in the culture was monitored daily and supplemented with additions of glucose as needed to
maintain the concentration in the fermenter at or above 1 g/L. At the time of harvest, the microcarriers were settled as
described above and the supernatant was aseptically removed and stored at 2-8°C for purification. Fresh serum-free
medium was replaced into the fermenter, the microcarriers were resuspended, and the culture was incubated and
harvested as before. This procedure could be repeated four times.
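
By way of illustration only, the serum washout arithmetic described above may be sketched as follows. The sketch assumes a working volume of about ten liters (three liters from the spinner vessel plus seven liters in the fermenter), removal of nine liters per cycle, and complete mixing on resuspension; the function names are hypothetical and are not part of the described procedure.

```python
import math


def exchanges_needed(total_volume_l, removed_volume_l, target_fold):
    """Number of settle/replace cycles needed to dilute the serum target_fold times."""
    # Each cycle leaves (total - removed) liters of old medium in the vessel.
    fold_per_cycle = total_volume_l / (total_volume_l - removed_volume_l)
    cycles = math.log(target_fold) / math.log(fold_per_cycle)
    # Rounding first guards against floating-point noise such as 2.9999999999999996.
    return math.ceil(round(cycles, 9))


def residual_fbs_percent(start_percent, total_volume_l, removed_volume_l, cycles):
    """FBS concentration (percent) remaining after the given number of cycles."""
    remaining_fraction = (total_volume_l - removed_volume_l) / total_volume_l
    return start_percent * remaining_fraction ** cycles


if __name__ == "__main__":
    n = exchanges_needed(10.0, 9.0, 1000.0)
    print(n, residual_fbs_percent(5.0, 10.0, 9.0, n))
```

Under these assumptions each cycle gives a ten-fold dilution, so three cycles would be expected to reach the 1000-fold removal, taking 5% FBS to roughly 0.005%.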

Example 2 

Purification of Growth Factor Activity

[0223] Conditioned media (10-20 liters) from MDA-MB-231 cells was clarified by centrifugation at 10,000 rpm in a
Sorvall Centrifuge, filtered through a 0.22 micron filter and then concentrated 10-50 (approx. 25) fold with a Minitan
Tangential Flow Unit (Millipore Corp.) with a 10 kDa cutoff polysulfone membrane at room temperature. Alternatively,
media was concentrated with a 2.5 L Amicon Stirred Cell at 4°C with a YM3 membrane. After concentration, the media
was again centrifuged at 10,000 rpm and the supernatant frozen in 35-50 ml aliquots at -80°C.
[0224] Heparin Sepharose was purchased from Pharmacia (Piscataway, NJ) and was prepared according to the
directions of the manufacturer. Five milliliters of the resin was packed into a column and was extensively washed (100
column volumes) and equilibrated with phosphate buffered saline (PBS). The concentrated conditioned media was
thawed, filtered through a 0.22 micron filter to remove particulate material and loaded onto the heparin-Sepharose
column at a flow rate of 1 ml/min. The normal load consisted of 30-50 ml of 40-fold concentrated media. After loading,
the column was washed with PBS until the absorbance at 280 nm returned to baseline before elution of protein was
begun. The column was eluted at 1 ml/min with successive salt steps of 0.3 M, 0.6 M, 0.9 M and (optionally) 2.0 M
NaCl prepared in PBS. Each step was continued until the absorbance returned to baseline, usually 6-10 column
volumes. Fractions of 1 milliliter volume were collected. All of the fractions corresponding to each wash or salt step were
pooled and stored for subsequent assay in the MDA-MB-453 cell assay.

[0225] The majority of the tyrosine phosphorylation stimulatory activity was found in the 0.6 M NaCl pool, which was
used for the next step of purification. Active fractions from the heparin-Sepharose chromatography were thawed, diluted
three fold with deionized (MilliQ) water to reduce the salt concentration and loaded onto a polyaspartic acid column
(PolyCAT A, 4.6 x 100 mm, PolyLC, Columbia, MD) equilibrated in 17 mM Na phosphate, pH 6.8. All buffers for this
purification step contained 30% ethanol to improve the resolution of protein on this column. After loading, the column
was washed with equilibration buffer and was eluted with a linear salt gradient from 0.3 M to 0.6 M NaCl in 17 mM Na
phosphate, pH 6.8, buffer. The column was loaded and developed at 1 ml/min and 1 ml fractions were collected during
the gradient elution. Fractions were stored at 4°C. Multiple heparin-Sepharose and PolyCAT A columns were processed
in order to obtain sufficient material for the next purification step. A typical absorbance profile from a PolyCAT A column
is shown in Figure 1. Aliquots of 10-25 µL were taken from each fraction for assay and SDS gel analysis.
[0226] Tyrosine phosphorylation stimulatory activity was found throughout the eluted fractions of the PolyCAT A
column with a majority of the activity found in the fractions corresponding to peak C of the chromatogram (salt
concentration of approximately 0.45 M NaCl). These fractions were pooled and adjusted to 0.1% trifluoroacetic acid (TFA)
by addition of 0.1 volume of 1% TFA. Two volumes of deionized water were added to dilute the ethanol and salt from
the previous step and the sample was subjected to further purification on high pressure liquid chromatography (HPLC)
utilizing a C4 reversed phase column (SynChropak RP-4, 4.6 x 100 mm) equilibrated in a buffer consisting of 0.1% TFA
in water with 15% acetonitrile. The HPLC procedure was carried out at room temperature with a flow rate of 1 ml/min.
After loading of the sample, the column was re-equilibrated in 0.1% TFA/15% acetonitrile. A gradient of acetonitrile
was established such that over a 10 minute period of time the acetonitrile concentration increased from 15 to 25%
(1%/min). Subsequently, the column was developed with a gradient from 25 to 40% acetonitrile over 60 min
(0.25%/min). Fractions of 1 ml were collected, capped to prevent evaporation, and stored at 4°C. Aliquots of 10 to 50
µL were taken, reduced to dryness under vacuum (SpeedVac), and reconstituted with assay buffer (PBS with 0.1%
bovine serum albumin) for the tyrosine phosphorylation assay. Additionally, aliquots of 10 to 50 µL were taken and
dried as above for analysis by SDS gel electrophoresis. A typical HPLC profile is shown in Figure 2.
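
For illustration only, the two-segment acetonitrile program of the reversed phase step may be written as a piecewise-linear function of time. The sketch assumes that time is measured from the start of the gradient and that there is no gradient delay between pump and column; the function name is hypothetical.

```python
def acetonitrile_percent(minutes):
    """Programmed percent acetonitrile at a given time after the gradient starts."""
    if minutes < 0:
        raise ValueError("time must be non-negative")
    # (start time, end time, starting %, ending %) for each linear segment
    segments = [
        (0.0, 10.0, 15.0, 25.0),   # 15 -> 25% over 10 min (1%/min)
        (10.0, 70.0, 25.0, 40.0),  # 25 -> 40% over 60 min (0.25%/min)
    ]
    for start, end, c0, c1 in segments:
        if minutes <= end:
            return c0 + (c1 - c0) * (minutes - start) / (end - start)
    return 40.0  # held at the final composition after the program ends


if __name__ == "__main__":
    for t in (0, 5, 10, 40, 70):
        print(t, acetonitrile_percent(t))
```

The slopes of the two segments reproduce the stated rates of 1%/min and 0.25%/min.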
[0227] A major peak of activity was found in fraction 17 (Figure 2B). By SDS gel analysis, fraction 17 was found to
contain a single major protein species which comigrated with the 45,000 dalton molecular weight standard (Figs. 2C,
3). In other preparations, the presence of the 45,000 dalton protein comigrated with the stimulation of tyrosine
phosphorylation activity in the MDA-MB-453 cell assay. The chromatographic properties of the 45,000 dalton protein were
atypical; in contrast to many other proteins in the preparation, the 45,000 dalton protein did not elute from the reversed
phase column within 2 or 3 fractions. Instead, it was eluted over 5-10 fractions. This is possibly due to extensive
post-translational modifications.

a. Protein Sequence Determination

[0228] Fractions containing the 45,000 dalton protein were dried under vacuum for amino acid sequencing. Samples
were redissolved in 70% formic acid and loaded into an Applied Biosystems, Inc. Model 470A vapor phase sequencer
for N-terminal sequencing. No discernible N-terminal sequence was obtained, suggesting that the N-terminal residue
was blocked. Similar results were obtained when the protein was first run on an SDS gel, transblotted to ProBlott
membrane and the 45,000 dalton band excised after localization by rapid staining with Coomassie Brilliant Blue.
[0229] Internal amino acid sequence was obtained by subjecting fractions containing the 45,000 dalton protein to
partial digestion using either cyanogen bromide, to cleave at methionine residues, Lysine-C to cleave at the C-terminal
side of lysine residues, or Asp-N to cleave at the N-terminal side of aspartic acid residues. Samples after digestion
were sequenced directly or the peptides were first resolved by HPLC chromatography on a Synchrom C4 column
(4000 Å, 2 x 100 mm) equilibrated in 0.1% TFA and eluted with a 1-propanol gradient in 0.1% TFA. Peaks from the
chromatographic run were dried under vacuum before sequencing.

[0230] Upon sequencing of the peptide in the peak designated number 15 (lysine C-15), several amino acids were
found on each cycle of the run. After careful analysis, it was clear that the fraction contained the same basic peptide
with several different N-termini, giving rise to the multiple amino acids in each cycle. After deconvolution, the following
sequence was determined (SEQ ID NO:3):

[A]AEKEKTF(C)VNGGEXFMVKDLXNP 
1 5 10 15 20 

(Residues in brackets were uncertain, while an X represents a cycle in which it was not possible to identify the amino
acid.)

[0231] The initial yield was 8.5 pmoles. This sequence comprising 24 amino acids did not correspond to any
previously known protein. Residue 1 was later found from the cDNA sequence to be Cys and residue 9 was found to be
correct. The unknown amino acids at positions 15 and 22 were found to be Cys and Ser, respectively.
[0232] Sequencing on samples after cyanogen bromide and Asp-N digestions, but without separation by HPLC, was
performed to corroborate the cDNA sequence. The sequences obtained are given in Table I and confirm the sequence
for the 45,000 dalton protein deduced from the cDNA sequence. The N-terminus of the protein appears to be blocked
with an unknown blocking group. On one occasion, direct sequencing of the 45,000 dalton band from a PVDF blot revealed
this sequence with a very small initial yield (0.2 pmole) (SEQ ID NO:4):

XEXKE(G)(R)GK(G)K(G)KKKEXGXG(K)

(Residues which could not be determined are represented by "X", while tentative residues are in parentheses). This
corresponds to a sequence starting at the serine at position 46 near the present N-terminal of HRG cDNA sequence;
this suggests that the N-terminus of the 45,000 dalton protein is at or before this point in the sequence.

Example 3

Cloning and Sequencing of Human Heregulin 

[0233] The cDNA cloning of the p185HER2 ligand was accomplished as follows. A portion of the lysine C-15 peptide
amino acid sequence was decoded in order to design a probe for cDNAs encoding the 45 kD HRG-α ligand. The
following 39 residue long eightfold degenerate deoxyoligonucleotide corresponding to the amino acid sequence (SEQ
ID NO:5) NH2-...AEKEKTFXVNGGE was chemically synthesized (SEQ ID NO:6):

3' GCTGAGAAGGAGAAGACCTTCTGT/CGTGAAT/CGGA/CGGCGAG 5'.



The unknown amino acid residue designated by X in the amino acid sequence was assigned as cysteine for design of
the probe. This probe was radioactively phosphorylated and employed to screen by low stringency hybridization an
oligo dT primed cDNA library constructed from human MDA-MB-231 cell mRNA in λgt10 (Huynh et al., 1984, In DNA
Cloning, Vol 1: A Practical Approach (D. Glover, ed) pp. 49-78. IRL Press, Oxford). Two positive clones designated
λgt10her16 and λgt10her13 were identified. DNA sequence analysis revealed that these two clones were identical.
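
By way of illustration only, the stated degeneracy of the probe can be checked by expanding its three two-fold degenerate positions. The sketch assumes that the oligonucleotide is read left to right as the coding strand and rewrites each "T/C" or "A/C" alternative in bracket notation; the codon table lists only the codons that occur in the probe, and all names are hypothetical.

```python
from itertools import product

# Probe from the text, with each two-base alternative written in brackets.
PROBE = "GCTGAGAAGGAGAAGACCTTCTG[TC]GTGAA[TC]GG[AC]GGCGAG"

# Partial standard genetic code: only the codons that appear in the probe.
CODONS = {
    "GCT": "A", "GAG": "E", "AAG": "K", "ACC": "T", "TTC": "F",
    "TGT": "C", "TGC": "C", "GTG": "V", "AAT": "N", "AAC": "N",
    "GGA": "G", "GGC": "G",
}


def expand(pattern):
    """Return every concrete sequence encoded by a bracketed degenerate pattern."""
    choices = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            j = pattern.index("]", i)
            choices.append(pattern[i + 1:j])  # alternatives at this position
            i = j + 1
        else:
            choices.append(pattern[i])
            i += 1
    return ["".join(bases) for bases in product(*choices)]


def translate(dna):
    """Translate a sequence whose length is a multiple of three."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    variants = expand(PROBE)
    print(len(variants), len(variants[0]), {translate(v) for v in variants})
```

Read in this way, the pool contains eight 39-base sequences, all of which translate to AEKEKTFCVNGGE, i.e. the peptide sequence with X assigned as cysteine.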

[0234] The 2010 basepair cDNA nucleotide sequence of λgt10her16 (Fig. 4) contains a single long open reading
frame of 669 amino acids beginning with alanine at nucleotide positions 3-5 and ending with glutamine at nucleotide
positions 2007-2009. No stop codon was found in the translated sequence; however, later analysis of heregulin β-type
clones indicates that methionine encoded at nucleotide positions 135-137 was the initiating methionine. Nucleotide
sequence homology with the probe is found between and including bases 681-719. Homology between those amino
acids encoded by the probe and those flanking the probe with the amino acid sequence determined for the lysine C-15
fragment verifies that the isolated clone encodes at least the lysine C-15 fragment of the 45 kD protein.
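
As an illustrative check only, the nucleotide coordinates given above can be converted to residue numbers of the open reading frame. The sketch assumes that residue 1 (alanine) occupies nucleotides 3-5 and that numbering is 1-based throughout; the helper names are hypothetical.

```python
ORF_START = 3  # first nucleotide of the first codon (alanine, positions 3-5)


def codon_start(residue):
    """First nucleotide position of the codon for a 1-based residue number."""
    return ORF_START + 3 * (residue - 1)


def residue_at(nucleotide):
    """1-based residue number whose codon contains the given nucleotide."""
    if nucleotide < ORF_START:
        raise ValueError("position lies upstream of the reading frame")
    return (nucleotide - ORF_START) // 3 + 1


if __name__ == "__main__":
    print(residue_at(2007))                   # last codon (glutamine)
    print(residue_at(135))                    # proposed initiating methionine
    print(residue_at(681), residue_at(719))   # span matched by the probe
```

On these assumptions the last codon is residue 669, the proposed initiating methionine is residue 45, and the 39-base probe match spans residues 227 through 239, i.e. thirteen codons.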
[0235] Hydropathy analysis shows the existence of a strongly hydrophobic amino acid region including residues
287-309 (Fig. 4) indicating that this protein contains a transmembrane or internal signal sequence domain and thus is
anchored to the membrane of the cell.

[0236] The 669 amino acid sequence encoded by the 2010 bp cDNA sequence contains potential sites for asparagine-
linked glycosylation (Winzler, R. in Hormonal Proteins and Peptides, (Li, CM. ed) pp 1-15 Academic Press, New York
(1973)) at positions asparagine 164, 170, 208, 437 and 609. A potential O-glycosylation site (Marshall, R.D. (1974)
Biochem. Soc. Symp. 40: 17-26) is presented in the region including a cluster of serine and threonine residues at amino
acid positions 209-218. Three sites of potential glycosaminoglycan addition (Goldstein, L.A., et al. (1989) Cell 56:
1063-1072) are positioned at the serine-glycine dipeptides occurring at amino acids 42-43, 64-65 and 151-152.
Glycosylation probably accounts for the discrepancy between the calculated MW of about 26 kD for the NTD-GFD
(extracellular) region of HRG and the observed MW of about 45 kD for purified HRG.

[0237] This amino acid sequence shares a number of features with the epidermal growth factor (EGF) family of
transmembrane bound growth factors (Carpenter, G., and Cohen, S. (1979) Ann. Rev. Biochem. 48: 193-216;
Massague, J. (1990) J. Biol. Chem. 265: 21393-21396) including 1) the existence of a proform of each growth factor from
which the mature form is proteolytically released (Gray, A., Dull, T.J., and Ullrich, A. (1983) Nature 303, 722-725; Bell,
G.I. et al. (1986) Nuc. Acid Res., 14: 8427-8446; Derynck, R. et al. (1984) Cell 38: 287-297); 2) the conservation of six
cysteine residues characteristically positioned over a span of approximately 40 amino acids (the EGF-like structural
motif) (Savage, R.C., et al. (1973) J. Biol. Chem. 248: 7669-7672; HRG-α cysteines 226, 234, 240, 254, 256 and 265);
and, 3) the existence of a transmembrane domain occurring proximally on the carboxy-terminal side of the EGF
homologous region (Fig. 4 and 6).

[0238] Alignment of the amino acid sequences in the region of the EGF motif and flanking transmembrane domain
of several human EGF related proteins (Fig. 6) shows that between the first and sixth cysteine of the EGF motif HRG
is most similar (50%) to the heparin binding EGF-like growth factor (HB-EGF) (Higashiyama, S. et al. (1991) Science
251: 936-939). In this same region HRG is ~35% homologous to amphiregulin (AR) (Plowman, G.D. et al., (1990) Mol.
Cell. Biol. 10: 1969-1981), ~32% homologous to transforming growth factor α (TGF-α) (8), 27% homologous with EGF
(Bell, G.I. et al., (1986) Nuc. Acid Res., 14: 8427-8446); and 39% homologous to the schwannoma-derived growth factor
(Kimura, H., et al. Nature, 348: 257-260, 1990). Disulfide linkages between cysteine residues in the EGF motif have
been determined for EGF (Savage, R.C. et al. (1973) J. Biol. Chem. 248: 7669-7672). These disulfides define the
secondary structure of this region and demarcate three loops. By numbering the cysteines beginning with 1 on the
amino-terminal end, loop 1 is delineated by cysteines 1 and 3; loop 2 by cysteines 2 and 4; and loop 3 by cysteines 5
and 6. Although the exact disulfide configuration in the region for the other members of the family has not been
determined, the strict conservation of the six cysteines, as well as several other residues, i.e., glycine 238 and 262 and
arginine at position 264, indicates that they too most likely have the same arrangement. HRG-α and EGF both have 13
amino acids in loop 1. HB-EGF, amphiregulin (AR) and TGF-α have 12 amino acids in loop 1. Each member has 10
residues in loop 2 except HRG-α which has 13. All five members have 8 residues in the third loop.
[0239] EGF, AR, HB-EGF and TGF-α are all newly synthesized as membrane anchored proteins by virtue of their
transmembrane domains. The proproteins are subsequently processed to yield mature active molecules. In the case
of TGF-α there is evidence that the membrane associated proforms of the molecules are also biologically active
(Brachmann, R., et al. (1989) Cell 56: 691-700), a trait that may also be the case for HRG-α. EGF is synthesized as a 1168
amino acid transmembrane bound proEGF that is cleaved on the amino-terminal end between arginine 970 and
asparagine 971 and at the carboxy-terminal end between arginine 1023 and histidine 1024 (Carpenter, G., and Cohen,
S. (1979) Ann. Rev. Biochem. 48: 193-216) to yield the 53 amino acid mature EGF molecule containing the three loop,
3 disulfide bond signature structure. The 252 amino acid proAR is cleaved between aspartic acid 100 and serine 101
and between lysine 184 and serine 185 to yield an 84 amino acid form of mature AR, and a 78 amino acid form is
generated by NH2-terminal cleavage between glutamine 106 and valine 107 (Plowman, G.D. et al., (1990) Mol. Cell.
Biol. 10: 1969-1981). HB-EGF is processed from its 208 amino acid primary translation product to its proposed 84
amino acid form by cleavage between arginine 73 and valine 74 and a second site approximately 84 amino acids away
in the carboxy-terminal direction (Higashiyama, S., et al., and Klagsbrun, M. (1991) Science 251: 936-939). The 160
amino acid proform of TGF-α is processed to a mature 50 amino acid protein by cleavages between alanine 39 and
valine 40 on one side and downstream cleavage between alanine 89 and valine 90 (Derynck et al., (1984) Cell 38:
287-297). For each of the above described molecules COOH-terminal processing occurs in the area bounded by the
sixth cysteine of the EGF motif and the beginning of the transmembrane domain.
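
For illustration only, the mature lengths quoted above follow from the cleavage positions by simple subtraction. In the sketch below each mature chain is taken to run from the residue immediately after the amino-terminal cleavage to the residue immediately before the carboxy-terminal cleavage; HB-EGF is omitted because its second site is given only approximately, and the dictionary and function names are hypothetical.

```python
def mature_length(first_residue, last_residue):
    """Length of a chain spanning first_residue..last_residue inclusive."""
    return last_residue - first_residue + 1


# (first residue, last residue) of each mature form, from the cleavage sites above
MATURE_FORMS = {
    "EGF": (971, 1023),         # Arg970/Asn971 ... Arg1023/His1024
    "AR (long)": (101, 184),    # Asp100/Ser101 ... Lys184/Ser185
    "AR (short)": (107, 184),   # Gln106/Val107 ... Lys184/Ser185
    "TGF-alpha": (40, 89),      # Ala39/Val40 ... Ala89/Val90
}

if __name__ == "__main__":
    for name, (first, last) in MATURE_FORMS.items():
        print(name, mature_length(first, last))
```

The computed lengths (53, 84, 78 and 50 residues) agree with the values stated for EGF, the two AR forms and TGF-α.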

[0240] The residues between the first and sixth cysteines of HRGs are most similar (45%) to heparin-binding EGF-
like growth factor (HB-EGF). In this same region they are 35% identical to amphiregulin (AR), 32% identical to TGF-α,
and 27% identical with EGF. Outside of the EGF motif there is little similarity between HRGs and other members of
the EGF family. EGF, AR, HB-EGF and TGF-α are all derived from membrane anchored proproteins which are
processed on both sides of the EGF structural unit, yielding 50-84 amino acid mature proteins (16-19). Like other EGF
family members, the HRGs appear to be derived from a membrane-bound proform but require only a single cleavage,
C-terminal to the cysteine cluster, to produce mature protein.

[0241] HRG may exert its biological function by binding to its receptor and triggering the transduction of a growth
modulating signal. This it may accomplish as a soluble molecule or perhaps as its membrane anchored form such as
is sometimes the case with TGF-α (Brachmann, R., et al., (1989) Cell 56: 691-700). Conversely, or in addition to
stimulating signal transduction, HRG may be internalized by a target cell where it may then interact with the controlling
regions of other regulatory genes and thus directly deliver its message to the nucleus of the cell. The possibility that
HRG mediates some of its effects by a mechanism such as this is suggested by the fact that a potential nuclear location
signal (Roberts, Biochim. Biophys. Acta (1989) 1008: 263-280) exists in the region around the three lysine residues at
positions 58-60 (Fig. 4).

[0242] The isolation of full-length cDNA of HRG-α is accomplished by employing the DNA sequence of Fig. 4 to select
additional cDNA sequences from the cDNA library constructed from human MDA-MB-231. Full-length cDNA clones
encoding HRG-α are obtained by identifying cDNAs encoding HRG-α longer in both the 3' and 5' directions and then
splicing together a composite of the different cDNAs. Additional cDNA libraries are constructed as required for this
purpose. Following are three types of cDNA libraries that may be constructed: 1) oligo-dT primed, where predominantly
stretches of polyadenosine residues are primed, 2) random primed using short synthetic deoxyoligonucleotides
non-specific for any particular region of the mRNA, and 3) specifically primed using short synthetic deoxyoligonucleotides
specific for a desired region of the mRNA. Methods for the isolation of such cDNA libraries were previously described.

Example 4 

Detection of HRG-α mRNA Expression by Northern Analyses

[0243] Northern blot analysis of MDA-MB-231 and SK-BR-3 cell mRNA under high stringency conditions shows at
least five hybridizing bands in MDA-MB-231 mRNA where a 6.4 Kb band predominates; other weaker bands are at 9.4,
6.9, 2.8 and 1.8 Kb (Fig. 5). No hybridizing band is seen in SK-BR-3 mRNA (this cell line overexpresses p185HER2). The
existence of these multiple messages in MDA-MB-231 cells indicates either alternative splicing of the gene, variable
processing of the gene's primary transcript or the existence of a transcript of another homologous message. One of
these messages may encode a soluble non-transmembrane bound form of HRG-α. Such messages (Fig. 5) may be
used to produce cDNA encoding soluble non-transmembrane bound forms of HRG-α.

Example 5 

Cell Growth Stimulation by Heregulin-α

[0244] Several different breast cancer cell lines expressing the EGF receptor or the p185HER2 receptor were tested
for their sensitivity to growth inhibition or stimulation by ligand preparations. The cell lines tested were: SK-BR-3 (ATCC
HTB 30), a cell line which overexpresses p185HER2; MDA-MB-468 (ATCC HTB 132), a line which overexpresses the
EGF receptor; and MCF-7 cells (ATCC HTB 22) which have a moderate level of p185HER2 expression. These cells
were maintained in culture and passaged according to established cell culture techniques. The cells were grown in a
1:1 mixture of DMEM and F-12 media with 10% fetal bovine serum. For the assay, the stock cultures were treated with
trypsin to detach the cells from the culture dish, and dispensed at a level of about 20,000 cells/well in a ninety-six well
microtiter plate. During the course of the growth assay they were maintained in media with 1% fetal bovine serum. The
test samples were sterilized by filtration through 0.22 micron filters and they were added to quadruplicate wells and
the cells incubated for 3-5 days at 37°C. At the end of the growth period, the media was aspirated from each well and
the cells treated with crystal violet (Lewis, G. et al., Cancer Research, 47:5382-5385 [1987]). The amount of crystal
violet absorbance, which is proportional to the number of cells in each well, was measured on a Flow Plate Reader.
Values from replicate wells for each test sample were averaged. Untreated wells on each dish served as controls.
Results were expressed as percent of growth relative to the control cells.
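
By way of illustration only, the reduction of the plate readings to "percent of growth relative to the control cells" may be sketched as follows. The absorbance values are invented for the example and are not data from this specification; the function name is hypothetical, and the sketch assumes that no blank subtraction is applied.

```python
from statistics import mean


def percent_of_control(sample_wells, control_wells):
    """Mean sample absorbance expressed as a percentage of the mean control absorbance."""
    control_mean = mean(control_wells)
    if control_mean <= 0:
        raise ValueError("control absorbance must be positive")
    return 100.0 * mean(sample_wells) / control_mean


if __name__ == "__main__":
    # Hypothetical quadruplicate crystal violet absorbances (arbitrary units).
    treated = [0.60, 0.62, 0.58, 0.60]
    untreated = [0.40, 0.41, 0.39, 0.40]
    print(round(percent_of_control(treated, untreated), 1))
```

A value above 100 would indicate stimulation of growth relative to the untreated wells, and a value below 100 would indicate inhibition.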

[0245] The purified HRG-α ligand was tested for activity in the cell growth assay and the results are presented in
Figure 7. At a concentration of approximately 1 nM ligand, both of the cell lines expressing the p185HER2 receptor
(SK-BR-3 and MCF-7) showed stimulation of growth relative to the controls while the cell type (MDA-MB-468) expressing
only the EGF receptor did not show an appreciable response. These results were consistent with those obtained from
the autophosphorylation experiments with the various cell lines. These results established that HRG-α ligand is specific
for the p185HER2 receptor and does not show appreciable interaction with the EGF receptor at these concentrations.
[0246] HRG does not compete with antibodies directed against the extracellular domain of p185HER2, but anti-
p185HER2 Mabs 2C4 and 7F3 (which are antiproliferative in their own right) do antagonize HRG.

Example 6

Cloning and Sequencing of Heregulin-β1

[0247] The isolation of HRG-β1 cDNA was accomplished by employing a hybridizing fragment of the DNA sequence
encoding HRG-α to select additional cDNA sequences from the cDNA library constructed from human MDA-MB-231
cells. Clone λher11.1dbl (heregulin-β1) was identified in a λgt10 oligo-dT primed cDNA library derived from MDA-MB-231
polyA+ mRNA. Radioactively labelled synthetic DNA probes corresponding to the 5' and 3' ends of λher16 (HRG-α)
were employed in a hybridization reaction under high stringency conditions to isolate the λher11.1dbl clone. The DNA
nucleotide sequence of the λher11.1dbl clone is shown in Figure 8 (SEQ ID NO:9). The HRG-β1 amino acid sequence is
homologous to HRG-α from its amino-terminal end at position Asp 15 of HRG-α through the 3' end of HRG-α except
at the positions described below. In addition, HRG-β1 encoding DNA extends 189 base pairs longer than λher16 in
the 3' direction and supplies a stop codon after Val 675. At nucleotide position 247 of λher11.1dbl there is a G substituted
for A, thereby resulting in the substitution of Gln(Q) in place of Arg(R) in HRG-β1 as shown in the second line of Figure
9 (SEQ ID NO:8 and SEQ ID NO:9).

[0248] In the area of the EGF motif there are additional differences between HRG-α and HRG-β1. These differences
are illustrated below in an expanded view of the homology between HRG-α and HRG-β1 in the region of the EGF motif
or the GFD (growth factor domain). The specific sequence shown corresponds to HRG-α amino acids 221-286 shown
in Figure 9. Asterisks indicate identical residues in the comparison below (SEQ ID NO:10 and SEQ ID NO:11).



HEREGULIN-α   SHLVKCAEKEKTFCVNGGEC
HEREGULIN-β1  ********************

HEREGULIN-α   FMVKDLSNPSRYLCKCQPGF
HEREGULIN-β1  ****************PNE*

HEREGULIN-α   TGARCTENVPMKVQNQEK--
HEREGULIN-β1  **D**QNY*MASFYKHLGIE

HEREGULIN-α   ---AEELYQKR (-Transmembrane)
HEREGULIN-β1  FME******** (-Transmembrane)

Example 7 

Expression of Heregulins in E. coli

[0249] HRG-α and HRG-β1 have been expressed in E. coli using the DNA sequences of Figures 4 and 8 encoding
heregulin under the control of the alkaline phosphatase promoter and the STII leader sequence. In the initial
characterization of heregulin activity, the natural amino and carboxy termini of the heregulin molecule were not
precisely defined. However, after comparison of heregulin to EGF and TGF-α sequences, we expected that shortened
forms of heregulin starting around Ser 221 and ending around Glu 277 of Figure 4 may have biological activity. Analogous
regions of all heregulins may be identified and expressed. One shortened form was constructed to have an N-terminal
Asp residue followed by the residues 221 to 277 of HRG-α. Due to an accidental frame shift mutation following Glu
277, the HRG-α sequence was extended by 13 amino acids on the carboxy terminal end. Thus, the carboxy-terminal end
was Glu 277 of HRG-α followed by the thirteen amino acid sequence RPNARLPPGVFYC (SEQ ID NO:20).
[0250] Expression of this construct was induced by growth of the cells in phosphate depleted medium for about 20
hours. Recombinant protein was purified by harvesting cell paste and resuspending in 10 mM Tris (pH 8), homogenizing,
incubating at 4°C for 40 minutes, followed by centrifuging at 15 K rpm (Sorvall). The supernatant was concentrated
on a 30K ultrafiltration membrane (Amicon) and the filtrate was applied to a MonoQ column equilibrated in 10 mM Tris
pH 8. The flow-through fractions from the MonoQ column were adjusted to 0.05% TFA (trifluoroacetic acid) and
subjected to C4 reversed phase HPLC. Elution was with a gradient of 10-25% acetonitrile in 0.1% TFA/H2O. The solvent
was removed by lyophilization and purified protein was resuspended in 0.1% bovine serum albumin in phosphate
buffered saline. Figure 10 depicts HER2 receptor autophosphorylation data with MCF-7 cells in response to the purified
E. coli-derived protein. This material demonstrated full biological activity with an EC50 of 0.8 nM. The purified material
was also tested in the cell growth assays (Example 5) and was found to be a potent stimulator of cell growth.

[0251] The recombinant expression vector for synthesis of HRG-β1 was constructed in a manner similar to HRG-α.
The expression vector contained DNA encoding HRG-β1 amino acids from Sergoy through Leujya (Figure 4). This DNA
encoding HRG-β1 was recombinantly spliced into the expression vector downstream from the alkaline phosphatase
promoter and STII leader sequence. An additional serine residue was spliced on the carboxy terminus as a result of
the recombinant construction process. The expression vector encoding HRG-β1 was used to transform E. coli and
expressed in phosphate depleted medium. Induced E. coli were pelleted, resuspended in 10 mM Tris (pH 7.5) and
sonicated. Cell debris was pelleted by centrifugation and the supernatant was filtered through a sterile filter before
assay. The expression of HRG-β1 was confirmed by the detection of protein having the ability to stimulate
autophosphorylation of the HER2 receptor in MCF-7 cells.

[0252] A similar expression vector was constructed as described for HRG-β1 (above) with a C-terminal tyrosine
residue instead of the serine residue. This vector was transformed into E. coli and expressed as before. Purification
of this recombinant protein was achieved as described for recombinant HRG-α. Mass spectrometric analysis revealed
that the purified protein consisted of forms which were shorter than expected. Amino acid sequencing showed that the
protein had the desired N-terminal residue (Ser), but it was found by mass spectrometry to be truncated at the C terminus.
The majority (>80%) of the protein consisted of a form 51 amino acids long with a C-terminal methionine (Met 271)
(SEQ ID NO:9). A small amount of a shorter form (49 residues) truncated at Val 269 was also detected. However,
both of the shortened forms showed full biological activity in the HER2 receptor autophosphorylation assay.
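The lengths and C-terminal positions reported for the two truncated forms can be cross-checked against each other. The sketch below uses only the numbers stated above; the helper name is hypothetical, and the check assumes that both forms are numbered as in Figure 4 and are contiguous.

```python
def n_terminal_position(c_terminal_position: int, length: int) -> int:
    """Position of the first residue of a contiguous fragment of the given length."""
    return c_terminal_position - length + 1


# (C-terminal residue number, fragment length) as reported by mass spectrometry.
FORMS = {
    "Met 271 form": (271, 51),
    "Val 269 form": (269, 49),
}

STARTS = {name: n_terminal_position(c_term, length)
          for name, (c_term, length) in FORMS.items()}

if __name__ == "__main__":
    for name, start in STARTS.items():
        print(f"{name}: implied N-terminal position {start}")
```

Both forms imply the same N-terminal position, which agrees with the Ser 221 start described for the shortened heregulin forms and with the finding that truncation occurred only at the C terminus.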

Example 8

ISOLATION OF HEREGULIN β2 AND β3 VARIANTS

[0253] Heregulin-β2 and -β3 variants were isolated in order to obtain cDNA clones that extend further in the 5'
direction. A specifically primed cDNA library was constructed in λgt10 by employing the chemically synthesized antisense
primer 3' CCTTCCCGTTCTTCTTCCTCGCTCC (SEQ ID NO:21). This primer is located between nucleotides 167-190
in the sequence of λher16 (Figure 4). The isolation of clone λ5'her13 (not to be confused with λher13) was achieved
by hybridizing a synthetic DNA probe corresponding to the 5' end of λher16 under high stringency conditions with the
specifically primed cDNA library. The nucleotide sequence of λ5'her13 is shown in Figure 11 (SEQ ID NO:22). The 496
base pair nucleotide sequence of λ5'her13 is homologous to the sequence of λher16 between nucleotides 309-496 of
λ5'her13 and 3-190 of λher16. λ5'her13 extends the open reading frame of λher16 by 102 amino acids.
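Because the antisense primer is written 3' to 5', its base-by-base complement reads 5' to 3' on the sense strand and can be searched for directly in a sense-strand listing. The following sketch illustrates this with a short excerpt transcribed from the SEQ ID NO:7 listing in this document; that this excerpt corresponds to the primer site in λher16 is an assumption made for illustration, and the variable and function names are hypothetical.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, keeping the input order."""
    return "".join(COMPLEMENT[base] for base in seq)


# Antisense primer exactly as printed, written 3' -> 5' (SEQ ID NO:21).
PRIMER_3_TO_5 = "CCTTCCCGTTCTTCTTCCTCGCTCC"

# Excerpt transcribed from one line of the SEQ ID NO:7 listing, 5' -> 3' (assumed sense strand).
SENSE_EXCERPT = "GGCAAAGGGAAGGGCAAGAAGAAGGAGCGAGGCTCCGGC"

# The primer is given 3' -> 5', so no reversal is needed to obtain the 5' -> 3' target.
TARGET = complement(PRIMER_3_TO_5)
OFFSET = SENSE_EXCERPT.find(TARGET)  # -1 would mean the target is absent

if __name__ == "__main__":
    print(TARGET)
    print(OFFSET)
```

The offset is relative to the start of the excerpt only and does not by itself give the nucleotide numbering of Figure 4.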

[0254] The isolation of variant heregulin-β forms was accomplished by probing a newly prepared oligo-dT primed
λgt10 MDA-MB-231 mRNA-derived cDNA library with synthetic probes corresponding to the 5' end of λ5'her13 and
the cysteine-rich EGF-like region of λher16. Three variants of heregulin-β were identified, isolated and sequenced.
The amino acid homologies between all heregulins are shown in Figure 15 (SEQ ID NOS:26-30).

[0255] HRG polypeptides λher76 (heregulin-β2) (SEQ ID NO:23), λher78 (heregulin-β3) (SEQ ID NO:24) and λher84
(heregulin-β2-like) (SEQ ID NO:25) are considered variants of λher11.1dbl (heregulin-β1) because, although the
deduced amino acid sequence is identical between cysteine 1 and cysteine 6 of the EGF-like motif, their sequences
diverge before the predicted transmembrane domain, which probably begins with amino acid 248 in λher11.1dbl.
[0256] The nucleotide sequences and deduced amino acid sequences of λher76, λher78 and λher84 are shown in
Figures 12, 13 and 14.

[0257] The variants each contain a TGA stop codon 148 bases 5' of the first methionine codon in their sequences.
Therefore the ATG codon at nucleotide position 135-137 of λher16 and the corresponding ATG in the other heregulin
clones may be defined as the initiating methionine (amino acid 1). Clones λher11.1dbl, λher76, λher84 and λher78 all
encode glutamine at amino acid 38 (Figure 15), whereas clone λher16 encodes arginine (Figure 4, position 82).

[0258] The deduced amino acid sequence of λher76 (heregulin-β2) reveals a full-length clone encoding 637 amino
acids. Its deduced amino acid sequence is identical to that of λher11.1dbl except that residues corresponding to amino
acids 232-239 of λher11.1dbl have been deleted. The deduced amino acid sequence of λher84 shows that it possesses
the same amino acid sequence as λher76 from the initiating methionine (amino acid 1, Figure 15) through the EGF-like
area and transmembrane domain. However, λher84 comes to an early stop codon at arginine 421 (λher84
numbering). Thereafter the 3' untranslated sequence diverges. The deduced amino acid sequence of λher78
(heregulin-β3) is homologous with heregulins-β1 and -β2 through amino acid 230, where the sequence diverges for eleven amino
acids and then terminates. Thus heregulin-β3 has no transmembrane region. The 3' untranslated sequence is not
homologous to the other clones.
Example 9

EXPRESSION OF HEREGULIN β FORMS

[0259] In order to express heregulin-β forms in mammalian cells, full-length cDNA nucleotide sequences from λher76
(heregulin-β2) or λher84 were subcloned into the mammalian expression vector pRK5.1. This vector is a derivative of
pRK5 that contains a cytomegalovirus promoter followed by a 5' intron, a cloning polylinker and an SV40 early
polyadenylation signal. COS7 monkey or human kidney 293 cells were transfected and conditioned medium was assayed
in the MCF-7 cell p185HER2 autophosphorylation assay. A positive response confirmed the expression of the cDNAs
from λher76 (heregulin-β2) and λher84 (heregulin-β2-like).

[0260] Supernatants from a large scale transient expression experiment were concentrated on a YM10 membrane
(Amicon) and applied to a heparin Sepharose column as described in Example 1. Activity (tyrosine phosphorylation
assay) was detected in the 0.6 M NaCl elution pool and was further purified on a polyaspartic acid column, as previously
described. By SDS gel analysis and activity assays, the active fractions of this column were highly purified and contained
a single band of protein with an apparent molecular weight of 45,000 daltons. Thus, the expressed protein has
chromatographic and structural properties which are very similar to those of the native form of heregulin originally isolated
from the MDA 231 cells. Small scale transient expression experiments with constructs made from λher84 cDNA also
revealed comparable levels of activity in the cell supernatants from this variant form. The expression of the
transmembrane-minus variant, heregulin-β3, is currently under investigation.

Example 10 

[0261] proHRG-α and proHRG-β1 cDNAs were spliced into Epstein Barr virus derived expression vectors containing
a cytomegalovirus promoter. rHRGs were purified (essentially as described in Example 2) from the serum-free
conditioned medium of stably transfected CEN4 cells [human kidney 293 cells (ATCC No. 1573) expressing the Epstein
Barr virus EBNA-1 transactivator]. In other experiments full length proHRG-α, -β1 and -β2 transient expression constructs
provided p185HER2 phosphorylation activity in the conditioned medium of transfected COS7 monkey kidney cells.
However, similar constructs of full length proHRG-β3 failed to yield activity, suggesting that the hydrophobic domain missing
in proHRG-β3 but present in the other proHRGs is necessary for secretion of mature protein. Truncated versions of
proHRG-α (63 amino acids, serine 177 to tyrosine 239) and proHRG-β1 (68 amino acids, serine 177 to tyrosine 241),
each encoding the GFD structural unit and immediate flanking regions, were also expressed in E. coli; homologous
truncated versions of HRG-β3 are expected to be expressed as active molecules. These truncated proteins were purified
from the periplasmic space and culture broth of E. coli transformed with expression vectors designed to secrete
recombinant proteins (C.N. Chang, M. Rey, B. Bochner, H. Heyneker, G. Gray, Gene, 55:189 [1987]). These proteins
also stimulated tyrosine phosphorylation of p185HER2 but not p^07^^^^, indicating that the biological activity of HRG
resides in the EGF-like domain of the protein and that carbohydrate moieties are not essential for activity in this assay.
The NTD does not inhibit or suppress this activity.

Example 11 

[0262] Various human tissues were examined for the presence of HRG mRNA. Transcripts were found in breast,
ovary, testis, prostate, heart, skeletal muscle, lung, liver, kidney, salivary gland, small intestine, and spleen but not in
stomach, pancreas, uterus or placenta. While most of these tissues display the same three classes of transcripts as
the MDA-MB-231 cells (6.6 kb, 2.5 kb and 1.8 kb), only the 6.6 kb message was observed in heart and skeletal
muscle. In brain a single transcript of 2.2 kb is observed, and in testis the 6.6 kb transcript appears along with others
of 2.2 kb, 1.9 kb and 1.5 kb. The tissue-specific expression pattern observed for HRG differs from that of p185HER2; for
example, adult liver, spleen, and brain contain HRG but not p185HER2 transcripts, whereas stomach, pancreas, uterus
and placenta contain p185HER2 transcripts but lack HRG mRNA.
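The tissue comparison above can be restated as simple set operations. The sketch below encodes only the tissues for which the text makes an explicit statement; the variable names are illustrative, and no p185HER2 status is assumed for tissues the text does not mention in that context.

```python
# Tissues in which HRG transcripts were reported (brain is reported separately in the text).
HRG_POSITIVE = {
    "breast", "ovary", "testis", "prostate", "heart", "skeletal muscle",
    "lung", "liver", "kidney", "salivary gland", "small intestine",
    "spleen", "brain",
}
# Tissues in which HRG transcripts were not found.
HRG_NEGATIVE = {"stomach", "pancreas", "uterus", "placenta"}

# p185HER2 status, limited to tissues explicitly named in the text.
HER2_POSITIVE_STATED = {"stomach", "pancreas", "uterus", "placenta"}
HER2_NEGATIVE_STATED = {"liver", "spleen", "brain"}

# Ligand transcript present, receptor transcript absent (and the reverse).
HRG_ONLY = HRG_POSITIVE & HER2_NEGATIVE_STATED
HER2_ONLY = HER2_POSITIVE_STATED & HRG_NEGATIVE

if __name__ == "__main__":
    print(sorted(HRG_ONLY))
    print(sorted(HER2_ONLY))
```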


SEQUENCE LISTING 

(1) GENERAL INFORMATION:
(i) APPLICANT: Genentech, Inc.

(ii) TITLE OF INVENTION: Structure, Production and Use of Heregulin
(iii) NUMBER OF SEQUENCES: 30

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Genentech, Inc.

(B) STREET: 460 Point San Bruno Blvd

(C) CITY: South San Francisco

(D) STATE: California

(E) COUNTRY: USA
(F) ZIP: 94080

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: 5.25 inch, 360 Kb floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: patin (Genentech)

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE: 21-May-1992

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE: 11-May-1992

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 07/847743

(B) FILING DATE: 06-Mar-1992

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 07/705256

(B) FILING DATE: 24-May-1991

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 07/765212
(B) FILING DATE: 25-Sep-1991

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 07/790801

(B) FILING DATE: 08-Nov-1991

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Hensley, Max D.

(B) REGISTRATION NUMBER: 27,043

(C) REFERENCE/DOCKET NUMBER: 712P4

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 415/266-1994

(B) TELEFAX: 415/952-9881

(C) TELEX: 910/371-7168

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 bases

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CNCAAT 6

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 bases 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
AATAAA 6 

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Ala Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Xaa
1 5 10 15

Phe Met Val Lys Asp Leu Xaa Asn Pro
20 24

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Xaa Glu Xaa Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys
1 5 10 15

Glu Xaa Gly Xaa Gly Lys
20 21
(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Ala Glu Lys Glu Lys Thr Phe Xaa Val Asn Gly Gly Glu
1 5 10 13

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GCTGAGAAGG AGAAGACCTT CTGTCGTGAA TCOGACGGCG AG 42 



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2199 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:



GG GAC AAA CTT TTC CCA AAC CCG ATC CGA GCC CTT GGA 38 
Asp Lys Leu Phe Pro Asn Pro lie Arg Ala Leu Gly 
15 10 

CCA AAC TCG CCT GCG CCG AGA GCC GTC CGC CTA GAG CGC 77 
Pro Asn Ser Pro Ala Pro Arg Ala Val Arg Val Glu Arg 
15 20 25 

• « 

TCC GTC TCC GGC GAG ATG TCC GAG CGC AAA GAA GGC AGA 116 
Ser Val Ser Gly Glu Met Ser Glu Arg Lye Glu Gly Arg 
30 3S 

GGC AAA GGG AAG GGC AAG AAG AAG GAG CGA GGC TCC GGC 155 
Gly Lys Gly Lys Gly Lys Lys Lys Glu Arg Gly Ser Gly 
40 45 SO 

AAG AAG CCG GAG TCC GCG GCG GGC AGC CAG AGC CCA GCC 194 
Lys Lys Pro Glu Ser Ala Ala Gly Ser Gin Ser Pro Ala 
55 60 

TTG CCT CCC CAA TTG AAA GAG ATG AAA AGC CAG GJ^ TCG 233 
Leu Pro Pro Gin Leu Lys Glu Met Lys Ser Gin Glu Ser 
65 70 75 

GCT GCA CGT TCC AAA CTA GTC CTT CGC TGT GAA ACC AGT 272 
Ala Ala Gly Ser Lys Deu Val heu Arg Cys Clu Thr Ser 
80 85 90 

TCT GAA TAC TCC TCT CTC AGA TTC AAG TGG TTC AAG AAT 311 
Ser Glu Tyr Ser Ser Leu Arg Phe hys Trp Ph© Lys Asn 
95 100 

GGG AAT GAA TTC AAT CGA AAA AAC AAA CCA CAA AAT ATC 350 
Gly Asn Glu LeU Asn Arjy Lye Aen Lys Pro Gin Asn lie 
105 110 115 

AAG ATA CAA AAA AAC CCA GGG AAG TCA GAA CTT CGC ATT 389 
Lys lie Gin Lys Lys Pro Gly Lys Ser Glu Leu Arg lie 
120 125 

AAC AAA GCA TCA CTG GCT GAT TCT GCA GAG TAT ATG TGC 428 
Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys 
130 135 140 

AAA GTG ATC AGO AAA TTA GGA AAT GAC AGT GCC TCT GCC 467 
Lys Val lie Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala 
145 * 150 155 

AAT ATC ACC ATC GTG GAA TCA AAC GAC ATC ATC ACT GGT 506 
Asn lie Thr lie Val Glu Ser Asn Glu lie He Thr Gly 
IGO 165 

ATG CCA GCC TCA ACT GAA GGA GCA TAT GTG TCT TCA GAG 545 
Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu 
170 - 175 180 

TCT CCC ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA AAT 584 
Ser Pro He Arg He Ser Val Ser mir Glu Gly Ala Asn 
185 190 

ACT TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC 623 
Thr Ser Ser Ser Thr Ser Hhr Ser Thr Thr Gly Thx Ser 
195 200 205 

CAT CTT GTA AAA TGT GCG GAG AAG GAG AAA ACT TTC TGT 662 
His Leu Val Lys Cys Ala Glu Lys Glu Lys Ttir Phe Cys 
210 215 220 

GTG AAT GGA GGG GAG TGC TTC ATG GTG AAA GAC CTT TCA 701 
Val Asn Gly Gly Glu Cys Phe Met Val Lye Asp Leu Ser * 
225 230 

AAC CCC TCG AGA TAC TTG TGC AAG TC5C CCA AAT GAG TTT 740 
Asn Pro Ser Arg Tyr Leu Cys Lys Cys Pare Asn Glu Phe 
235 240 245 

ACT GGT GAT CGC TGC CAA AAC TAC GTA ATG GCC AGC TTC 775 
Thr Gly Asp Arg Cys Gin Asn Tyr Val Mat Ala Ser Phe 
250 255 

TAC AAG CAT CTT GGG ATT GAA TIT ATG GAG GCG GAG GAG 813 
Tyr Lvs His Leu Gly He Glu Phe Met Glu Ala Glu GJ,u 
260 ' 265 270 

CTG TAC CAG AAG AGA GTG CTG ACC ATA ACC GGC ATC TGC 857 
Leu T^'r Gin Lvs Arg Val Leu Thr He Thr Gly He Cys 
275 280 285 
ATC GCC CTC CTT GTC CTC GGC ATC ATC TCT CTG CTG GCC 896
He Ala Leu Leu Val Val Gly He Met Cys Val Val Ala 
290 295 

TAG T3C AAA ACC AAG AAA CAG CGG AAA AAG CTG CAT GAC 935 
Tyr Cys Lys Thr Lys Lys Gin Arg Lys Lys Leu His Asp 
300 305 310 

CGT CTT CGG CAG ACC CTT CGG TCT GAA CGA AAC AAT ATG 974 
Arg Leu Arg Gin Ser Leu Arg Ser Glu Arg Asn Aan Met 
315 320 

ATG AAC ATT GCC AAT GGG CCT CAC CAT CCT AAC CCA CCC 1013 
Met Asn lie Ala Asn Gly Pro His His Pro Asn Pro Pro 
325 330 335 

CCC GAG AAT GTC CAG CTG GTG AAT CAA TAC GTA TCT AAA 1052 
Pro Glu Asn Val Gin L^eu Val Asn Gin nyr Val Ser Lys 
340 345 350 

AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA CCA 1091 
Acn Val Ho Ser Ser Glu His He Val Glu Arg Glu Ala 
355 360 

GAG ACA TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA GCC 1130 
Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala 
365 370 375 

CAT CAC TCC ACT ACT GTC ACC CAG ACT CCT AGC CAC AGC 1169 
His His Ser Thr Thr Val Thr Gin Thr Pro Ser His Ser 
380 385 

TGG AGC AAC GGA CAC ACT GAA AGC ATC CTT TCC GAA AGC 1208 
Trp Ser Asn Gly His Thr Glu Ser He I^u Ser Glu Ser 
390 395 400 

CAC TCT GTA ATC GTG ATG TCA TCC GTA GAA AAC AGT AGG 1247 
His Ser Val He Val Mot Ser Ser Val Glu Asn Ser Arg 
405 410 415 

CAC ACC AGC CCA ACT GGG GGC CCA AGA GGA CGT CTT AAT 1286 
His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn 
420 425 

GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC TTC CTC AGG 1325 
Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Lbu Alrg 
430 435 440 

CAT CCC AGA GAA ACC CCT GAT TCC TAC CGA GAC TCT CCT 1364 
His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro 
445 450 

CAT AGT GAA AGG TAT GTG TCA GCC ATG ACC ACC CCG GCT 1403 
His Ser Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala 
455 460 465 

CGT ATG TCA CCT GTA GAT TTC CAC ACC CCA AGC TCC CCC 1442 
Arg Met Ser Pro Val Asp Phe His Thr Pro Ser Se^ Pro 
470 475 480 

AAA TCG CCC CCT TCG GAA ATG TCT CCA CCC GTG TCC AGC 1481 
Lys Ser Pro Pro Ser Glu Met Ser Pro Pro Val Ser Ser 
465 490 
ATG ACG GTG TCC ATG CCT TCC ATG GCG GTC AGC CCC TTC 1520
Met Thr Val Ser Met Pro Ser Met Ala Val Ser Pro Phe
495 500 505

ATG GAA GAA GAG AGA CCT CTA CTT CTC GTG ACA CCA CCA 1559 
Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro 
510 515 

AGG CTG CGG GAG AAG AAG TTT GAC CAT CAC CCT CAG CAG 1598 
Arg Leu Arg Glu Lys Lys Phe Asp His His Pro Gin Gin 
520 525 530 

TTC AGC TCC TTC CAC CAC AAC CCC GCG CAT GAC ACT AAC 1637 
Phe Ser Ser Phe His His Asn Pro Ala His Asp Ser Asn 
535 540 545 

AGC CTC CCT GCT AGC CCC TTG AGG ATA GTG GAG GAT GAG 1676 
Ser Leu Pro Ala Ser Pro Leu Arg lie Val Glu Asp Glu 
550 555 

GAG TAT GAA ACG ACC CAA GAG TAC GAG CCA GCC CAA GAG 1715 
Glu Tyr Glu Thr Thr Gin Glu Tyr Glu Pro Ala Gin Glu 
560 565 570 

CCT GTT AAG AAA CTC GCC AAT AGC COG CGG GCC AAA AGA 1754 
Pro Val Lys Lys Leu Ala Asn Ser Arg Arg Ala Lys Arg 
575 580 

ACC AAG CCC AAT GGC CAC ATT GCT AAC AGA TTG GAA GTG 1793 
Thr Lys Pro Asn Gly His lie Ala Asn Arg Leu Glu Val 
585 • 590 595 

CAC AGC AAC ACA AGC TCC CAG AGC ACT AAC TCA GAG ACT 1832 
Asp Ser Asn Thx Ser Ser Gin Ser Ser Asn Ser Glu Ser 
600 605 610 

GAA ACA GAA GAT GAA AGA GTA GGT GAA GAT ACG CCT TTC 1871 
Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr Pro Phe 
615 620 

CTG GGC ATA CAG AAC CCC CTG GCA GCC AGT CTT GAG GCA 1910 
Leu Gly lie Gin Asn Pro Leu Ala Ala Ser Leu Qlu Ala 
625 63 0 635 

ACA CCT GCC TTC CGC CTG CCT GAC AGC AGG ACT AAC CCA 1949 
Thr Pro Ala Phe Arg L*u Ala Asp Ser Arg Thr Asn' Pro* 
640 645 

GCA GGC CGC TTC TCG ACA CAG GAA GAA ATC CAG GCC AGG 1988 
Ala Gly Arg Phe Ser Thr Gin Glu Glu He Gin Ala Ary 
650 655 660 

CTG TCT AGT GTA ATT GCT AAC CAA GAC CCT ATT GCT GTA TA 2029 
Leu Ser Ser Val lie Ala Asn Gin Asp Pro He Ala Val 
665 670 675 

A AACCTAAATA AACACATAGA TTCACCTGTA AAACTTTATT 2070 
TTATATAATA AAGTATTCCA CCTTAAATTA AACAATTTAT TTTATTTTAG 2120 
CAGTTCTGCA AATAGAAAAC AGGAAAAAAA CTTTTATAAA TTAAATATAT 2170 
(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 669 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Ala Arg Ala Pro Gin Arg Cly Arg Ser L«u S©r Pro Sor Arg Asp 
15 10 15 

Lys Leu Phe Pro Asn Pro He Arg Ala Leu Gly Pro Asn Ser Pro 
20 25 30 

Ala Pro Arg Ala Val Arg Val Glu Arg Ser Val Ser Gly Glu Met 
35 40 45 

Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys 
50 55 60 

Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser Gin 
65 70 75 

Ser Pro Ala Leu Pro Pro Arg Leu Lys Glu Mat Lys Ser Gin Glu 
80 • as 90 

Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser 
95 100 105 

Glu lyr Ser Sor Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu 
110 115 120 

X«eu Asn Arg Lys Asn Lys Pro Gin Asn He Lys He Gin Lys Lys 
125 130 135 

Pro Gly Lys Ser Glu Leu Arg He Asn Lys Ala Ser Leu Ala Asp 
140 145 ISO 

Ser Gly Glu Tyr Met Cys Lys Val He Ser Lys Leu Gly Asn Asp 
155 160 ' *165 

Ser Ala Ser Ala Asn He Thr He Val Glu Ser Asn Glu He He 
170 175 180 

Thr Gly Met Pro Ala Ser Thr Glu Gly Ala lyr Val Ser Ser Glu 
185 190 195 

Ser Pro He Arg He Ser Val Ser Thr Glu Gly Ala Asn Thr Ser 
200 205 210 

Ser Ser Thr Ser Thr Ser Thr Thr Gly Itir Ser His Leu Val Lys 
215 220 ^ 225 

Cys Ala Glu Lye Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys 
230 235 240 

Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lye 
245 250 255 
Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro
260 265 270 

Met Lys VaI Gin Asa Gin Glu Lys Ala Glu Glu Lrou Tyr Gin Lys 
275 280 285 

Arg Val Leu Thr lie Thr Gly He Cys lie Ala Leu Leu Val Val 
290 295 300 

Gly He Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gin Arg 
305 310 315 

Lys Lys L«u His Asp Arg Leu Arg Gin Ser Leu Arg Ser Glu Arg 
320 325 330 

Asn Asn Met Met Asn He Ala Asn Gly Pro His His Pro Asn Pro 
335 340 345 

Pro Pro Glu Asn Val Gin Leu Val Asn Gin Tyr Val Ser Lys Asn 
350 355 360 

Val He Ser Ser Giu His He Val Glu Arg Glu Ala Glu Thr Ser 
3=5 370 375 

Phe Ser Thr Ser Kis Tyr Thr Ser Thr Ala His His Ser Thr Thr 
380 385 390 

Val Thr Gin Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu 
395 400 405 

Ser He Leu Ser Glu Ser His Ser Val He Val Met Ser Ser Val 
410 41S 420 

Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg 
425 430 435 

lieu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg 
440 445 450 

His Ala Arg Glu T^jc Pro Asp Ser Tyr Arg Asp Ser Pro His Ser 
455 460 455 

Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro 
470 475 480 

Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu 
4S5 490 495 

Met Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Met 
SCO SOS 510 

Ala Val Ser Pro Phe Met Glu Glu Glu Arg Pro Leu t,eu Leu Val 
515 520 525 

Thr Pro Pro Arg Leu Arg Glu Lys Lys Phe Asp His His Pro Glr. 

53-: 535 540 

Gin Phe Ser Ser Fhs His His Asn Pro Ala His Asp Ser Asn Ser 
545 550 555 

Leu Pro Ala Ser Pr^ Leu Arg He Val Glu Asp Glu Glu Tyr Glu 
5€: 565 S7C 
Thr Thr Gln Glu Tyr Glu Pro Ala Gln Glu Pro Val Lys Lys Leu
575 580 585 

Ala Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn Gly His He 
590 595 600 

Ala Asn Arg Leu Glu Val Asp Ser Asn Thr Ser Ser Gin Ser Ser 
605 610 615 

AsD Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr 
620 625 630 

Pro Phe Leu Gly He Gin Asn Pro Leu Ala Ala Ser Leu Glu Ala 
635 640 645 

Thr Pro Ala Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala Gly 
650 655 660 

Arg Phe Ser Thr Gin Glu Glu lie Gin 
665 669 

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 732 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Asp Lys Leu Phe Pro Asn Pro He Arg Ala Leu Gly Pro Asn Ser 
15 10 15 

Pro Ala Pro Arg Ala Val Arg Val Glu Arg Ser Val Ser Gly Glu 
20 25 30 

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys 
35 40 45 

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser 
50 55 60 

Gin Ser Pro Ala Leu Pro Pro Gin Leu Lys Glu MeL Lys Ser Gin 
65 70 75 

Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu ?rhr ^er 
80 85 90 

Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn 
95 100 105 

Glu Leu Asn Arg Lys Asn Lys Pro Gin Asn lie Lys He Gin Lys 
110 115 120 

Lys Pro Gly Lys Ser Glu Leu Arg He Asn Lys Ala Ser Leu Ala 
125 130 135 

Asp Ser Gly Glu Tyr Met Cys Lys Val He Ser Lys Leu Gly Asn 
140 145 150 

Asp Ser Ala Ser Ala Asn He Thr He Val Glu Ser Asn Glu He 
155 160 165 

He Thr Gly Met Pro Ala Ser Thr Glu Gly Ala T^t: Val Ser Ser 



170 175 180 

Glu Ser Pro He Arg He Ser Val Ser Thr Glu Gly Ala Asn Thr 
185 190 195 

Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val 
200 205 210 

Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu 
215 220 225 

Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg lyr Leu Cys 
230 235 240 

Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gin Asn Tyr Val 
245 250 255 

Met Ala Ser Phe Tyr Lys His Leu Gly lie Glu Phe Met Glu Ala 
260 265 270 

Glu Glu Leu Tyr Gin Lys Arg Val Leu Thr He Thr Gly He Cys 
275 280 285 

He Ala Leu Leu Val Val Gly He Met Cys Val Val Ala Tyr Cys 
290 295 300 

Lys Thr Lys Lys Gin Arg Lys Lys Leu His Asp Arg Leu Arg Gin 
305 310 315 

Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn He Ala Asn Gly 
320 - 325 330 

Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gin Leu Val Asn 
335 340 345 

Gin Tyr Val Ser Lvs Asn Val He Ser Ser Glu His He Val Glu 
350 355 360 

Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr rRir Ser Thr 
365 370 375 

Ala His His Ser Thr Thr Val Thr Gin Thr Pro Ser His Ser Trp 
380 365 390 

Ser Asn Gly His Thr Glu Ser He Leu Ser Glu Ser His Ser Val 
395 400 ' 465 

He Val Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Thr 
410 415 420 

Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu 
425 430 435 

Cys Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser T/r 
440 445 450 

Arg Asp Ser Pro His Ser Glu Arg Tyr Val Ser Ala Met Thr Thr 
455 460 _ 465 

Pro Ala Arg Met Ser Pro Val Asp Phe His Thr Pro Ser Ser Pro 
470 475 480 

Lys Ser Pro Pro Ser Glu Met Ser Pro Pre Val Ser Ser Met Thr 
485 49C 495 
Val Ser Met Pro Ser Met Ala Val Ser Pro Phe Met Glu Glu Glu
500 505 510 

5 Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg Glu Lys Lys 

515 520 525 

Phe Asp His His Pro Gin Gin Phe Ser Ser Phe His His Asn Pro 
530 535 540 

10 Ala His Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ilo Val 

545 550 555 

Glu Asp Glu Glu Tyr Glu Thr Thr Gin Glu Tyr Glu Pro Ala Gin 
560 565 570 

^5 Glu Pro Val Lys Lys Leu Ala Asn Ser Arg Axg Ala Lys Arg Thr 

575 580 585 

Lys Pro Asn Gly His He Ala Asn Arg Leu Glu Val Asp Ser Asn 
590 595 600 

Thr Ser Ser Gin Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu 
605 €10 615 

Axg Val Gly Glu Asp Thr Pro Phe Leu Gly He Gin Asn Pro Leu 
620 625 630 

Ala Ala Ser Leu Glu Ala Thr Pro Ala Phe Arg Leu Ala Asp Ser 
635 640 645 

Arg Thr Asn Pro Ala Gly Arg Pho Ser Thr Gin Glu Glu Ho Gin 
650 655 660 

Ala Arg Leu Ser Ser Val He Ala Asn Gin Asp Pro He Ala Val 
665 670 675 

Xaa Asn Leu Asn Lys His He Asp Ser Pro Val Lys Leu Tyr Phe 
680 685 690 

He Xaa Xaa Ser He Pro Pro Xaa He Lys Gin Phe He Leu Phe 
695 700 705 

Xaa Gin Phe Cyc Lys Xaa Lys Thr Gly Lys Lys Leu Leu Xaa He 
710 715 720 

Lys Tyr Met Tyr Val Lys Mat Lys Lys Lys Lys Lys 
725 730 732 

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 66 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val
1 5 10 15

Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser
20 25 30

Arg Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys
35 40 45

Thr Glu Asn Val Pro Met Lys Val Gln Asn Gln Glu Lys Ala Glu
50 55 60

Glu Leu Tyr Gln Lys Arg
65 66

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 71 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val
1 5 10 15

Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser
20 25 30

Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys
35 40 45

Gln Asn Tyr Val Met Ala Ser Phe Tyr Lys His Leu Gly Ile Glu
50 55 60

Phe Met Glu Ala Glu Glu Leu Tyr Gln Lys Arg
65 70 71
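The two short sequences listed as SEQ ID NO:10 and SEQ ID NO:11 can be compared residue by residue. The sketch below is illustrative only: the one-letter strings were transcribed by hand from the three-letter listings with apparent OCR misreadings corrected, which is an assumption, and the function and variable names are hypothetical.

```python
# One-letter transcriptions of the three-letter listings (hand-converted; see assumption above).
SEQ_ID_10 = "SHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCQPGFTGARCTENVPMKVQNQEKAEELYQKR"
SEQ_ID_11 = "SHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASFYKHLGIEFMEAEELYQKR"


def shared_prefix_length(a: str, b: str) -> int:
    """Number of leading residues that are identical in both sequences."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


if __name__ == "__main__":
    print(len(SEQ_ID_10), len(SEQ_ID_11))
    print(shared_prefix_length(SEQ_ID_10, SEQ_ID_11))
    print(SEQ_ID_10.count("C"), SEQ_ID_11.count("C"))
```

With these transcriptions the lengths match the stated 66 and 71 amino acids, the sequences are identical until shortly after the fifth cysteine, and each contains six cysteines, as expected for an EGF-like motif.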

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2010 bases 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GGGCGCGAGC GCCTCAGCGC GGCOGCTCGC TCTCCCCCTC GAGGGACAAA 50 



CTTTTCCCAA ACCCGATCCG AGCCCTTGCA CCAAACTCCC CTGCGCCGAG 100 



AGCCGTCCGC GTAGAGOGCT CCGTCTCCGG CGAGATGTCC GAGCGCAAAG 150 



AAGGCAGAGG CAAAGGGAAG GGCAAGAAGA AGGAGCGAGG CTCCGGCAAG 200 



AAGCCGGAGT CCGCGGCGGG CAGCCAGAGC CCAGCCTTGC CTCCCCGATT 250 



GAAAGAGATG AAAAGCCAGG AATCGGCTGC AGGTTCCAAA CTAGTCCTTC 300 



GGTGTGAAAC CAGTTCTGAA TACTCCTCTC TCAGATTCAA GrGGTTC^J^ 35C 
AATGGGAATG AATTGAATCG AAAAAACAAA CCACAAAATA TCAAGATACA 400
AAAAAAGCCA GGGAAGTCAG AACTTCGCAT TAACAAAGCA TCACTGGCTG 450 
ATTCTGGAGA GTATATGTGC AAAGTGATCA GCAAATTAGG AAATGACAGT 500 
GCCTCTGCCA ATATCACCAT CGTGGAATCA AACGAGATCA TCACTGGTAT 550 
GCCAGCCTCA ACTGAAGGAG CATATGTGTC TTCAGAGTCT CCCATTAGAA 600 
TATCAGTATC CACAGAAGGA GCAAATACTT CTTCATCTAC ATCTACATCC 650 
ACCACTGGGA CAAGCCATCT TGTAAAATGT GCGGAGAAGG AGAAAACTTT 700 
CTGTGTGAAT GGAGGGGAGT GCTTCATGGT GAAAGACCTT TCAAACCCCT 750 
CGAGATACTT GTGCAAGTGC CAACCTGGAT TCACTGGAGC AAGATCTACT 800 
GAGAATGTGC CCATGAAAGT CCAAAACCAA GAAAAGGCGG AGGAGCTGTA 850 
CCAGAAGAGA GTGCTGACCA TAACCGGCAT CTGCATCGCC CTCCTTGTGG 900 
TCGGCATCAT GTGTGTGGTG GCCTACTGCA AAACCAAGAA ACAGCGGAAA 950 
AAGCTGCATG ACCGTCTTCG GCAGAGCCTT CGGTCTGAAC GAAACAATAT 1000 
GATGAACA1T GCCAATGGGC CTCACCATCC TAACCCACCC CCCGAGAATC 1050 
TCCAGCTGGT GAATCAATAC GTATCTAAAA ACGTCATCTC CAGTGAGCAT 1100 
ATTGTTGAGA GAGAAGCAGA GACATCCTTT TCCACCAGTC ACTATACTTC 1150 
CACAGCCCAT CACTCCACTA CTGTCACCCA GACTCCTAGC CACAGCTGGA 1200 
GCAACGGACA CACTGAAAGC ATCCTTTCCG AAAGCCACTC TGTAATCGTG 1250 
ATGTCATCCG TAGAAAACAG TAGGCACAGC AGCCCAACTG GGGGCCCAAG 1300 
AGGACGTCTT AATGGCACAG GACGCCCTCG TGAATGTAAC AGCTTCCTCA 1350 
GGCATGCCAG AGAAACCCCT GATTCCTACC GAGACTCTCC TCATAGTGAA 1400 
AGtrrATOTGT CACCCATC3AC CACCCCXWCT CCTATGTCAC CTCTAGATTT 1450
CCACACGCCA AGCTCCCCCA AATCGCCCCC TTCGGAAATG TCTCCACCCG 1500 

TGTCCAGCAT GACGGTGTCC ATGCCTTCCA TGGCGGTCAG CCCCTTCATG 1550 
GAAGAAGAGA GACCTCTACT rTCTCGTGACA CCACCAAGGC TGCGGGAGAA 1600 

CAAGTTTGAC CATCACCCTC AGCAGTTCAG CTCCTTCCAC CACAACCCCG 1650 
CGCATGACAG TAACAGCCTC CCTGCTAGCC CCTTGAGGAT AGTGGAGGAT 1700 

GAGGAGTATG AAACGACCCA AGAGTACGAG CCAGCCCAAG AGCCTGTTAA 1750 
GAAACTCGCC AATAGCCXX3C GGGCCAAAAG AACCAAGCCC AATGGCXJ^CA 1800 

TTGCTAACAG ATOXMAAGTG CACAGCAACA CAAGCTCCCA GAGCAGTAAC 1850 
TCAGAGAGTG AAACAGAAGA TGAAAGAGTA GGTGAAGATA CGCCTTTCCT 1900 

GGGCATACAG AACGCCCTGG CAGCCAGTCT TGAGGCAACA CCTGCdTCC 1950 
GCCPGGCTGA CAGCAGGACT AACCCAGCAG GCCGCTTCTC GACACAGGAA 2000 

GAAATCCAGG 2010 
(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 669 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Ala Arg Ala Pro Gin Arg Gly Arg Ser Leu S«r Pro Ser Arg Asp 
1 5 10 15 

Lys Leu Phe Pro Asn Pro lie Arg Ala Leu Gly Pro Asn Ser Pro 
20 25 30 

Ala Pro Ara Ala Val Arg Val Glu Arg Ser Val Ser Gly Glu Met 
35 40 45 

Ser Glu Arg Lvs Glu Gly Arg Gly Lys Gly Lye Gly Lys Lys Lys 
50 55 60 

Glu Ara Gly Ser Gly Lvs Lvs Pro Glu Ser Ala Ala Gly Ser Gin 
cS ' ' 70 75 
Ser Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu
80 85 90 

Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser 
95 100 105 

Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu 
110 115 120 

Leu Asn Arg Lys Asn Lys Pro Gin Asn Ho Lys He Gin Lys Lys 
125 130 135 

Pro Gly Lys Ser Glu Leu Arg He Asn Lys Ala Ser Leu Ala Asp 
140 145 150 

Ser Gly Glu Tyr Met Cys Lys Val lie Ser Lys Leu Gly Asn Asp 
155 1€0 165 

Ser Ala Ser Ala Asn He Thr He Val Glu Ser Asn Glu He 

170 175 loO 

Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu 
185 190 i95 

Ser Pro He Arg He Ser Val Ser Thr Glu Gly Ala Asn Thr Ser 
200 205 210 

Ser Ser Thr Ser Thr Ser Thr Ttiz: Gly thr Ser His Leu Val Lys 
215 220 225 

Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys 
230 235 240 

Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyx Leu Cys Lys 
245 250 255 

Cys Gin Pro £ly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro 
260 265 270 

Met Lys Val Gin Asn Gin Glu Lys Ala Glu Glu Leu Tyr Gin Lys 
275 280 285 

Arg Val I,eu Thr He thr Gly He Cys He Ala Leu Leu Val Val 
290 295 300 

• « 

Oly H« Met Cys Val Val Ala Tyx Cys Lys Tlir Lys Lys Gin Arg 
305 310 315 

Lys Lys Leu His Asp Arg Leu Arg Gin Ser Leu Arg Ser Glu Arg 
320 325 330 

Asn Asn Met Met Asn He Ala Asn Gly Pro His His Pro Asn Pro 
335 340 345 

Pro Pro Glu Asn Val Gin Leu Val Asn Gin Tyr Val Ser Lys Asn 
350 355 360 

Val He Ser Ser Glu His He Val Glu Arg Glu Ala Glu Thr Ser 
365 370 375 

Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr Thr 
380 385 390 



Val Thr Gin Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu 
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395 400 

Ser He Leu Ser Glu Ser His Ser Val Il« Val Met Ser Ser Val 
410 415 420 

Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg 
425 430 435 

Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg 
440 445 450 

His Ala Arg Glu Thr Pro Asp Ser Tyx Arg Asp Ser Pro His Ser 
455 460 465 

Glu Ara Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro 
470 475 480 

Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser C'u 
485 490 5 

Met Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Met 
500 505 510 

Ala Val Ser Pro Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val 
515 520 525 

Thr Pro Pro Arg Leu Arg Glu Lys Lys Phe Asp His His Pro Gin 
530 535 540 

Gin Phe Ser Ser Phe His His Asn Pro Ala His Asp Ser Asn Ser 
545 - 550 555 

Leu Pro Ala Ser Pro Leu Arg lie Val Glu Asp Glu Glu Tyr Glu 
560 565 570 

Thr Thr Gin Glu Tyr Glu Pro Ala Gin Glu Pro Val Lys Lys Leu 
575 580 585 

Ala Asn Ser Arg Arg Ala Lys Arg Ita Lys Pro Aen Gly His He 
590 595 600 

Ala Asn Arg Leu Glu Val Asp Ser Asn Tbr Ser Ser Gin Ser Ser 
605 eiO 615 

Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu A^p Tbr 
620 ^25 630 

Pro Phe Leu Gly He Gin Acn Pro Leu Ala Ala Ser Leu Glu Ala 
635 640 645 

Ibr Pro Ala Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala Gly 
650 655 660 

Arg Phe Ser Thr Gin Glu Glu lie Gin 
665 €69 

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 95 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val
1 5 10 15

Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser
20 25 30

Arg Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys
35 40 45

Thr Glu Asn Val Pro Met Lys Val Gln Asn Gln Glu Lys Ala Glu
50 55 60

Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile
65 70 75

Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys
80 85 90

Thr Lys Lys Gln Arg
95

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 91 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu
1 5 10 15

His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala
20 25 30

Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg
35 40 45

Asp Leu Lys Trp Trp Glu Leu Arg His Ala Gly His Gly Gln Gln
50 55 60

Gln Lys Val Ile Val Val Ala Val Cys Val Val Val Leu Val Met
65 70 75

Leu Leu Leu Leu Ser Leu Trp Gly Ala His Tyr Tyr Arg Thr Gln
80 85 90

Lys
91

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 82 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Asn Asp Cys Pro Asp Ser His Thr Gln Phe Cys Phe His Gly Thr
1 5 10 15

Cys Arg Phe Leu Val Gln Glu Asp Lys Pro Ala Cys Val Cys His
20 25 30

Ser Gly Tyr Val Gly Ala Arg Cys Glu His Ala Asp Leu Leu Ala
35 40 45

Val Val Ala Ala Ser Gln Lys Lys Gln Ala Ile Thr Ala Leu Val
50 55 60

Val Val Ser Ile Val Ala Leu Ala Val Leu Ile Ile Thr Cys Val
65 70 75

Leu Ile His Cys Cys Gln Val
80 82

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 87 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Lys Lys Lys Asn Pro Cys Asn Ala Glu Phe Gln Asn Phe Cys Ile
1 5 10 15

His Gly Glu Cys Lys Tyr Ile Glu His Leu Glu Ala Val Thr Cys
20 25 30

Lys Cys Gln Gln Glu Tyr Phe Gly Glu Arg Cys Gly Glu Lys Ser
35 40 45

Met Lys Thr His Ser Met Ile Asp Ser Ser Leu Ser Lys Ile Ala
50 55 60

Leu Ala Ala Ile Ala Ala Phe Met Ser Ala Val Ile Leu Thr Ala
65 70 75

Val Ala Val Ile Thr Val Gln Leu Arg Arg Gln Tyr
80 85 87

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 87 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Lys Lys Lys Asn Pro Cys Ala Ala Lys Phe Gln Asn Phe Cys Ile
1 5 10 15

His Gly Glu Cys Arg Tyr Ile Glu Asn Leu Glu Val Val Thr Cys
20 25 30

His Cys His Gln Asp Tyr Phe Gly Glu Arg Cys Gly Glu Lys Thr
35 40 45

Met Lys Thr Gln Lys Lys Asp Asp Ser Asp Leu Ser Lys Ile Ala
50 55 60

Leu Ala Ala Ile Ile Val Phe Val Ser Ala Val Ser Val Ala Ala
65 70 75

Ile Gly Ile Ile Thr Ala Val Leu Leu Arg Lys Arg
80 85 87

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 86 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Lys Lys Arg Asp Pro Cys Leu Arg Lys Tyr Lys Asp Phe Cys Ile
1 5 10 15

His Gly Glu Cys Lys Tyr Val Lys Glu Leu Arg Ala Pro Ser Cys
20 25 30

Ile Cys His Pro Gly Tyr His Gly Glu Arg Cys His Gly Leu Ser
35 40 45

Leu Pro Val Glu Asn Arg Leu Tyr Thr Tyr Asp His Thr Thr Ile
50 55 60

Leu Ala Val Val Ala Val Val Leu Ser Ser Val Cys Leu Leu Val
65 70 75

Ile Val Gly Leu Leu Met Phe Arg Tyr His Arg
80 85 86

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

Arg Pro Asn Ala Arg Leu Pro Pro Gly Val Phe Tyr Cys 
1 5 10 13

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 bases 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CCTCGCTCCT TCTTCTTGCC CTTCC 25 
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(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 496 bases 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:



AA AGA GCC GGC GAG GAG TTC CCC GAA ACT TGT TGG AAC 38
Arg Ala Gly Glu Glu Phe Pro Glu Thr Cys Trp Asn
1 5 10

TCC GGG CTC GCG CGG AGG CCA GGA GCT GAG CGG CGG CGG 77
Ser Gly Leu Ala Arg Arg Pro Gly Ala Glu Arg Arg Arg
15 20 25

CTG CCG GAC GAT GGG AGC GTC AGC AGG ACG GTG ATA ACC 116
Leu Pro Asp Asp Gly Ser Val Ser Arg Thr Val Ile Thr
30 35

TCT CCC CGA TCG GGT TGC GAG GGC GCC GGG CAG AGG CCA 155
Ser Pro Arg Ser Gly Cys Glu Gly Ala Gly Gln Arg Pro
40 45 50

GGA CGC GAG CCG CCA GCG GTG GGA CCC ATC GAC GAC TTC 194
Gly Arg Glu Pro Pro Ala Val Gly Pro Ile Asp Asp Phe
55 60

CCG GGG CGA CAG GAG CAG CCC CGA GAG CCA GGG CGA GCG 233
Pro Gly Arg Gln Glu Gln Pro Arg Glu Pro Gly Arg Ala
65 70 75

CCC GTT CCA GGT GGC CGG ACC GCC CGC CGC GTC CGC GCC 272
Pro Val Pro Gly Gly Arg Thr Ala Arg Arg Val Arg Ala
80 85 90

GCG CTC CCT GCA GGC AAC GGG AGA CGC CCC CGC GCA GCG 311
Ala Leu Pro Ala Gly Asn Gly Arg Arg Pro Arg Ala Ala
95 100

CGA GCG CCT CAG CGC GGC CGC TCG CTC TCC CCC TCG AGG 350
Arg Ala Pro Gln Arg Gly Arg Ser Leu Ser Pro Ser Arg
105 110 115

GAC AAA CTT TTC CCA AAC CCG ATC CGA GCC CTT GGA CCA 389
Asp Lys Leu Phe Pro Asn Pro Ile Arg Ala Leu Gly Pro
120 125

AAC TCG CCT GCG CCG AGA GCC GTC CGC GTA GAG CGC TCC 428
Asn Ser Pro Ala Pro Arg Ala Val Arg Val Glu Arg Ser
130 135 140

GTC TCC GGC GAG ATG TCC GAG CGC AAA GAA GGC AGA GGC 467
Val Ser Gly Glu Met Ser Glu Arg Lys Glu Gly Arg Gly
145 150 155

AAA GGG AAG GGC AAG AAG AAG GAG CGA GG 496
Lys Gly Lys Gly Lys Lys Lys Glu Arg
160 164
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(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2490 bases

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GTGGCTGCGG GGCAATTGAA AAAGAGCCGG CGAGGAGTTC CCCGAAACTT 50 
GTTGGAACTC CGGGCTCGCG CGGAGGCCAG GAGCTGAGCG GCGGCGGCTG 100 
CCGGACGATG GGAGCGTCAG CAGGACGGTG ATAACCTCTC CCCGATCGGG 150 
TTGCGAGGGC GCCGGGCAGA GGCCAGGACG CGACCCCCCA GCGGCGGGAC 200
CCATCGACGA CTTCCCGGGG CGACAGGAGC AGCCCCGAGA GCCAGGGCGA 250 
GCGCCCGTTC CAGGTGGCCG GACCGCCCGC CGCGTCCGCG CCGCGCTCCC 300 
TGCAGGCAAC GGGAGACGCC CCCGCGCACC GCGAGCCCCT CAGCGCGGCC 350 
GCTCGCTCTC CCCATCGAGG GACAAACTTT TCCCAAACCC GATCCGAGCC 400 
CTTGGACCAA ACTCGCCTGC GCCGAGAGCC GTCCGCGTAG AGCGCTCCGT 450 



CTCCGGCGAG ATG TCC GAG CGC AAA GAA GGC AGA GGC AAA 490
Met Ser Glu Arg Lys Glu Gly Arg Gly Lys
1 5 10

GGG AAG GGC AAG AAG AAG GAG CGA GGC TCC GGC AAG AAG 529
Gly Lys Gly Lys Lys Lys Glu Arg Gly Ser Gly Lys Lys
15 20

CCG GAG TCC GCG GCG GGC AGC CAG AGC CCA GCC TTG CCT 568
Pro Glu Ser Ala Ala Gly Ser Gln Ser Pro Ala Leu Pro
25 30 35

CCC CAA TTG AAA GAG ATG AAA AGC CAG GAA TCC GCT GCA 607
Pro Gln Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala
40 45

GGT TCC AAA CTA GTC CTT CGG TGT GAA ACC AGT TCT GAA 646
Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu
50 55 60

TAC TCC TCT CTC AGA TTC AAG TGG TTC AAG AAT GGG AAT 685
Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn
65 70 75

GAA TTG AAT CGA AAA AAC AAA CCA CAA AAT ATC AAG ATA 724
Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile
80 85

CAA AAA AAG CCA GGG AAG TCA GAA CTT CGC ATT AAC AAA 763
Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys
90 95 100

GCA TCA CTG GCT GAT TCT GGA GAG TAT ATG TGC AAA GTG 802
Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val
105 110

ATC AGC AAA TTA GGA AAT GAC AGT GCC TCT GCC AAT ATC 841
Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile
115 120 125

ACC ATC GTG GAA TCA AAC GAG ATC ATC ACT GGT ATG CCA 880
Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly Met Pro
130 135 140

GCC TCA ACT GAA GGA GCA TAT GTG TCT TCA GAG TCT CCC 919
Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu Ser Pro
145 150

ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA AAT ACT TCT 958
Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr Ser
155 160 165

TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC CAT CTT 997
Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu
170 175

GTA AAA TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT 1036
Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn
180 185 190

GGA GGG GAG TGC TTC ATG GTG AAA GAC CTT TCA AAC CCC 1075
Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro
195 200 205

TCG AGA TAC TTG TGC AAG TGC CCA AAT GAG TTT ACT GGT 1114
Ser Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly
210 215

GAT CGC TGC CAA AAC TAC GTA ATG GCC AGC TTC TAC AAG 1153
Asp Arg Cys Gln Asn Tyr Val Met Ala Ser Phe Tyr Lys
220 225 230

GCG GAG GAG CTG TAC CAG AAG AGA GTG CTG ACC ATA ACC 1192
Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr
235 240

GGC ATC TGC ATC GCC CTC CTT GTG GTC GGC ATC ATG TGT 1231
Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys
245 250 255

GTG GTG GCC TAC TGC AAA ACC AAG AAA CAG CGG AAA AAG 1270
Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys Lys
260 265 270

CTG CAT GAC CGT CTT CGG CAG AGC CTT CGG TCT GAA CGA 1309
Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg
275 280

AAC AAT ATG ATG AAC ATT GCC AAT GGG CCT CAC CAT CCT 1348
Asn Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro
285 290 295

AAC CCA CCC CCC GAG AAT GTC CAG CTG GTG AAT CAA TAC 1387
Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr
300 305

GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG 1426
Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu
310 315 320

AGA GAA GCA GAG ACA TCC TTT TCC ACC AGT CAC TAT ACT 1465
Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr Thr
325 330 335

TCC ACA GCC CAT CAC TCC ACT ACT GTC ACC CAG ACT CCT 1504
Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro
340 345

AGC CAC AGC TGG AGC AAC GGA CAC ACT GAA AGC ATC CTT 1543
Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Ile Leu
350 355 360

TCC GAA AGC CAC TCT GTA ATC GTG ATG TCA TCC GTA GAA 1582
Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu
365 370

AAC AGT AGG CAC AGC AGC CCA ACT GGG GGC CCA AGA GGA 1621
Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly
375 380 385

CGT CTT AAT GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC 1660
Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser
390 395 400

TTC CTC AGG CAT GCC AGA GAA ACC CCT GAT TCC TAC CGA 1699
Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg
405 410

GAC TCT CCT CAT AGT GAA AGG TAT GTG TCA GCC ATG ACC 1738
Asp Ser Pro His Ser Glu Arg Tyr Val Ser Ala Met Thr
415 420 425

ACC CCG GCT CGT ATG TCA CCT GTA GAT TTC CAC ACG CCA 1777
Thr Pro Ala Arg Met Ser Pro Val Asp Phe His Thr Pro
430 435

AGC TCC CCC AAA TCG CCC CCT TCG GAA ATG TCT CCA CCC 1816
Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro Pro
440 445 450

GTG TCC AGC ATG ACG GTG TCC AAG CCT TCC ATG GCC GTC 1855
Val Ser Ser Met Thr Val Ser Lys Pro Ser Met Ala Val
455 460 465

AGC CCC TTC ATG GAA GAA GAG AGA CCT CTA CTT CTC GTG 1894
Ser Pro Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val
470 475

ACA CCA CCA AGG CTG CGG GAG AAG AAG TTT GAC CAT CAC 1933
Thr Pro Pro Arg Leu Arg Glu Lys Lys Phe Asp His His
480 485 490

CCT CAG CAG TTC AGC TCC TTC CAC CAC AAC CCC GCC CAT 1972
Pro Gln Gln Phe Ser Ser Phe His His Asn Pro Ala His
495 500

GAC AGT AAC AGC CTC CCT GCT AGC CCC TTG AGG ATA GTG 2011
Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ile Val
505 510 515

GAG GAT GAG GAG TAT GAA ACG ACC CAA GAG TAC GAG CCA 2050
Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro
520 525 530

GCC CAA GAG CCT GTT AAG AAA CTC GCC AAT AGC CGG CGG 2089
Ala Gln Glu Pro Val Lys Lys Leu Ala Asn Ser Arg Arg
535 540

GCC AAA AGA ACC AAG CCC AAT GGC CAC ATT GCT AAC AGA 2128
Ala Lys Arg Thr Lys Pro Asn Gly His Ile Ala Asn Arg
545 550 555

TTG GAA GTG GAC AGC AAC ACA AGC TCC CAG AGC AGT AAC 2167
Leu Glu Val Asp Ser Asn Thr Ser Ser Gln Ser Ser Asn
560 565

TCA GAG AGT GAA ACA GAA GAT GAA AGA GTA GGT GAA GAT 2206
Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp
570 575 580

ACG CCT TTC CTG GGC ATA CAG AAC CCC CTG GCA GCC AGT 2245
Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser
585 590 595

CTT GAG GCA ACA CCT GCC TTC CGC CTG GCT GAC AGC AGG 2284
Leu Glu Ala Thr Pro Ala Phe Arg Leu Ala Asp Ser Arg
600 605

ACT AAC CCA GCA GGC CGC TTC TCG ACA CAG GAA GAA ATC 2323
Thr Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Ile
610 615 620

CAG GCC AGG CTG TCT AGT GTA ATT GCT AAC CAA GAC CCT 2362
Gln Ala Arg Leu Ser Ser Val Ile Ala Asn Gln Asp Pro
625 630

ATT GCT GTA TAAAACCTA AATAAACACA TAGATTCACC TGTAAAACTT 2410
Ile Ala Val
635 637

TMTTTTAIAT AATAAAGTAT TCCACCTTAA ATTAAACAAT TTATTTTATT 2460 
TTAGCAGTTC TGCAAATAAA AAAAAAAAAA 2490 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1715 bases

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GCGCCTGCCT CCAACCTGCG GGCGGGAGGT GGGTGGCTGC GGGGCAATTG 50 
AAAAAGACCC GGCGAGGAGT TCCCCGAAAC TTCTTGGAAC TCCGGGCTCG 100 
CGCGGAGGCC AGGAGCTGAG CGGCGGCGGC TGCCGGACGA TCGGAGCGTG 150
AGCAGGACGG TGATAACCTC TCCCCGATCG GGTTGCGAGG GCGCCGGGCA 200 
GAGGCCAGGA CGCGAGCCGC CAGCGGCGGG ACCCATCGAC GACTTCCCGG 250 
GGCGACAGGA GCAGCCCCGA GAGCCAGGGC GAGCGCCCGT TCCAGGTGGC 300
CGGACCGCCC GCCGCGTCCG CGCCGCGCTC CCTCCAGGCA ACGGGAGACG 350 
CCCCCGCGCA GCGCGAGCGC GTCAGCGCCG CCGCTCGCTC TCCCCATCGA 400 
GGGACAAACT TTTCCCAAAC CCGATCCGAG CCCTTGGACC AAACTCGCCT 450 



GCGCCGAGAG CCGTCCGCGT AGAGCGCTCC GTCTCCGGCG AG ATG 495
Met
1

TCC GAG CGC AAA GAA GGC AGA GGC AAA GGG AAG GGC AAG 534
Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys
5 10

AAG AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG TCC GCG 573
Lys Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala
15 20 25

GCG GGC AGC CAG AGC CCA GCC TTG CCT CCC CAA TTG AAA 612
Ala Gly Ser Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys
30 35 40

GAG ATG AAA AGC CAG GAA TCG GCT GCA GGT TCC AAA CTA 651
Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu
45 50

GTC CTT CGG TGT GAA ACC AGT TCT GAA TAC TCC TCT CTC 690
Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu
55 60 65

AGA TTC AAG TGG TTC AAG AAT GGC AAT GAA TTG AAT CGA 729
Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg
70 75

AAA AAC AAA CCA CAA AAT ATC AAG ATA CAA AAA AAG CCA 768
Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys Lys Pro
80 85 90

GGG AAG TCA GAA CTT CGC ATT AAC AAA GCA TCA CTG GCT 807
Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala
95 100 105

GAT TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC AAA TTA 846
Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu
110 115

GGA AAT GAC AGT GCC TCT GCC AAT ATC ACC ATC GTG GAA 885
Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu
120 125 130

TCA AAC GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA 924
Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu
135 140

GGA GCA TAT GTG TCT TCA GAG TCT CCC ATT AGA ATA TCA 963
Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser
145 150 155

GTA TCC ACA GAA GGA GCA AAT ACT TCT TCA TCT ACA TCT 1002
Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser
160 165 170

ACA TCC ACC ACT GGG ACA AGC CAT CTT GTA AAA TGT GCG 1041
Thr Ser Thr Thr Gly Thr Ser His Leu Val Lys Cys Ala
175 180

GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG GAG TGC 1080
Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys
185 190 195

TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG 1119
Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu
200 205

TGC AAG TGC CCA AAT GAG TTT ACT GGT GAT CGC TGC CAA 1158
Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln
210 215 220

AAC TAC GTA ATG GCC AGC TTC TAC AGT ACG TCC ACT CCC 1197
Asn Tyr Val Met Ala Ser Phe Tyr Ser Thr Ser Thr Pro
225 230 235

TTT CTG TCT CTG CCT GAA TAGGA GCATGCTCAO TTGGTOCTGC 1240
Phe Leu Ser Leu Pro Glu
240 241

TTTCTTGTTG CTGCATCTCC CCTCAGATTC CACCTAGAGC TAGATGTGTC 1290
TTACCAGATC TAATATTGAC TGCCTCTGOC TGTCGCATGA GAACATTAAC 1340
AAAAGCAATT GTATTACTIC CTCTGTTOGC GACTAGTTGG CTCTGAGATA 1390
CTAATAGGTG TGTGAGGCTC CGGATGTTTC TGGAATTGAT ATTGAATGAT 1440
GTGATACAAA TTGATAGTCA ATATCAAGCA GTGAAATATG ATAATAAAGG 1490
CATTTCAAAG TCTCACTTTT ATTGATAAAA TAAAAATCAT TCTACTGAAC 1540
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AGTCCATCTT CTTTATACAA TGACCACATC CTGAAAAGGG TGTTGCTAAG 1590 
CTGTAACCGA TATGCACTTG AAATGATGGT AAGTTAATTT TGATTCAGAA 1640 
TGTGTTATTT GTCACAAATA AACATAATAA AAGGAGTTCA GATGTTTTTC 1690 
TTCATTAACC AAAAAAAAAA AAAAA 1715 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2431 bases

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: N.A.

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 



GAGGCGCCTG CCTCCAACCT GCGGGCGGGA GGTGCGTGGC TGCGGGGCAA 50 



TTGAAAAAGA GCCGGCGAGG AGTTCCCCGA AACTTGTTGG AACTCCGGGC 100



TCGCGCGGAG GCCAGGAGCT GAGCGGCGGC GGCTGCCGGA CGATGGGAGC 150 



GTGAGCAGGA CGGTGATAAC CTCTCCCCGA TCGGGTTGCG AGGGCGCCGG 200 



GCAGAGGCCA GGACGCGAGC CGCCACCGGC GGGACCCATC GACGACTTCC 250



CGGGGCGACA GGAGCAGCCC CGAGAGCCAG GGCGACCGCC CGTTCCAGGT 300 



GGCCGGACCG CCCGCCGCGT CCGCGCCGCG CTCCCTGCAG GCAACGGGAG 350



ACGCCCCCCC GCAGCCCGAG CGCCTCAGCG CGGCCGCTCG CTCTCCCCAT 400 



CGAGGGACAA ACTTTTCCCA AACCCGATCC GAGCCCTTGG ACCAAACTCG 450 



CCTGCGCCCA GAGCCGTCCG CGTAGAGCGC TCCGTCTCCG GCGAG AT 497 

Met 
1 

G TCC GAG CGC AAA GAA GGC AGA GGC AAA GGG AAG GGC AAG 537
Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys
5 10

AAG AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG TCC GCG 576
Lys Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala
15 20 25

GCG GGC AGC CAG AGC CCA GCC TTG CCT CCC CAA TTG AAA 615
Ala Gly Ser Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys
30 35 40

GAG ATG AAA AGC CAG GAA TCG GCT GCA GGT TCC AAA CTA 654
Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu
45 50

GTC CTT CGG TGT GAA ACC AGT TCT GAA TAC TCC TCT CTC 693
Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu
55 60 65

AGA TTC AAG TGG TTC AAG AAT GGG AAT GAA TTG AAT CGA 732
Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg
70 75

AAA AAC AAA CCA CAA AAT ATC AAG ATA CAA AAA AAG CCA 771
Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys Lys Pro
80 85 90

GGG AAG TCA GAA CTT CGC ATT AAC AAA GCA TCA CTG GCT 810
Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala
95 100 105

GAT TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC AAA TTA 849
Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu
110 115

GGA AAT GAC AGT GCC TCT GCC AAT ATC ACC ATC GTG GAA 888
Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu
120 125 130

TCA AAC GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA 927
Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu
135 140

GGA GCA TAT GTG TCT TCA GAG TCT CCC ATT AGA ATA TCA 966
Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser
145 150 155

GTA TCC ACA GAA GGA GCA AAT ACT TCT TCA TCT ACA TCT 1005
Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser
160 165 170

ACA TCC ACC ACT GGG ACA AGC CAT CTT GTA AAA TGT GCG 1044
Thr Ser Thr Thr Gly Thr Ser His Leu Val Lys Cys Ala
175 180

GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG GAG TGC 1083
Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys
185 190 195

TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG 1122
Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu
200 205

TGC AAG TGC CCA AAT GAG TTT ACT GGT GAT CGC TGC CAA 1161
Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln
210 215 220

AAC TAC GTA ATG GCC AGC TTC TAC AAG GCG GAG GAG CTC 1200
Asn Tyr Val Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu
225 230 235

TAC CAG AAG AGA GTG CTG ACC ATA ACC GGC ATC TGC ATC 1239
Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile
240 245

GCC CTC CTT GTG GTC GGC ATC ATG TGT GTG GTG GCC TAC 1278
Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr
250 255 260

TGC AAA ACC AAG AAA CAG CGG AAA AAG CTC CAT GAC CGT 1317
Cys Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg
265 270

CTT CGG CAG AGC CTT CGG TCT GAA CGA AAC AAT ATG ATG 1356
Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn Asn Met Met
275 280 285

AAC ATT GCC AAT GGG CCT CAC CAT CCT AAC CCA CCC CCC 1395
Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro
290 295 300

GAG AAT GTC CAG CTG GTG AAT CAA TAC GTA TCT AAA AAC 1434
Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn
305 310

GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA GCA GAG 1473
Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu
315 320 325

ACA TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA GCC CAT 1512
Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His
330 335

CAC TCC ACT ACT GTC ACC CAG ACT CCT AGC CAC AGC TGG 1551
His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp
340 345 350

AGC AAC GGA CAC ACT GAA AGC ATC CTT TCC GAA AGC CAC 1590
Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser His
355 360 365

TCT GTA ATC GTG ATG TCA TCC GTA GAA AAC AGT AGG CAC 1629
Ser Val Ile Val Met Ser Ser Val Glu Asn Ser Arg His
370 375

AGC AGC CCA ACT GGG GGC CCA AGA GGA CGT CTT AAT GGC 1668
Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn Gly
380 385 390

ACA GGA GGC CCT CGT GAA TGT AAC AGC TTC CTC AGG CAT 1707
Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His
395 400

GCC AGA GAA ACC CCT GAT TCC TAC CGA GAC TCT CCT CAT 1746
Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His
405 410 415

AGT GAA AGG TAAAA CCGAAGGCAA AGCTACTGCA GAGGAGAAAC 1790
Ser Glu Arg
420

TCACTCACAC AATCCCTGTG AGCACCTGCG GTCTCACCTC AGGAAATCTA 1840
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CTCTAATCAG AATAAGGC3GC GGCACSTTACC TGTTCTAGGA GTC2CTCCTAG 1890 
TTGATGAAGT CATCTCTTTG TTTGACGGAA CTTATTTCTT CTGAGCTTCT 1940 
CTCGTCGTCC CAGTGACTGA CAGGCAACAQ ACTCTTAAAG AGCTGGGATG 1990 
CTTTGATGCG GAAGGTCCAG CACATGGAGT TTCCAGCTCT GGCCATCGGC 2040 
TCAGACCCAC TCGGOGTCTC AGTGTCCTCA GTTGTAACAT TAGAGAGATG 2090 
GCATCAATGC TO3ATAAGGA CCCTTCTATA AITCCAATTG CCAGTTATCC 2140
AAACTCTGAT TCGCTGGTCG AGCTCGCXTTC GTGTTCTTAT CTGCTAACCC 2190
TGTCTTACCT TCCAGCCTCA GTTAAGTCAA ATCAAGGGCT ATGTCATTGC 2240
TGAATGTCAT GOGCGGCAAC TGCTTGCCCT CCACCCTATA CTATCTATTT 2290
TATGAAATTC CAAGAAGGGA TGAATAAATA AATCTCTTOG ATCCTGCX3TC 2340
TOGCAGTCTT CACXSGGTGGT TTTCAAAGCA GAAAAAAAAA AAAAAAAAAA 2390
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA A 2431



(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 625 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys
1 5 10 15

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser
20 25 30

Gln Ser Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln
35 40 45

Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser
50 55 60

Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn
65 70 75

Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys
80 85 90

Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala
95 100 105

Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn
110 115 120

Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile
125 130 135

Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser
140 145 150

Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr
155 160 165

Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val
170 175 180

Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu
185 190 195

Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys
200 205 210

Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val
215 220 225

Pro Met Lys Val Gln Asn Gln Glu Lys Ala Glu Glu Leu Tyr Gln
230 235 240

Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val
245 250 255

Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln
260 265 270

Arg Lys Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu
275 280 285

Arg Asn Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn
290 295 300

Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys
305 310 315

Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr
320 325 330

Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr
335 340 345

Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr
350 355 360

Glu Ser Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser
365 370 375

Val Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly
380 385 390

Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu
395 400 405

Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His
410 415 420

Ser Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser
425 430 435

Pro Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser
440 445 450

Glu Met Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser
455 460 465

Met Ala Val Ser Pro Phe Met Glu Glu Glu Arg Pro Leu Leu Leu
470 475 480

Val Thr Pro Pro Arg Leu Arg Glu Lys Lys Phe Asp His His Pro
485 490 495

Gln Gln Phe Ser Ser Phe His His Asn Pro Ala His Asp Ser Asn
500 505 510

Ser Leu Pro Ala Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr
515 520 525

Glu Thr Thr Gln Glu Tyr Glu Pro Ala Glu Glu Pro Val Lys Lys
530 535 540

Leu Ala Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn Gly His
545 550 555

Ile Ala Asn Arg Leu Glu Val Asp Ser Asn Thr Ser Ser Gln Ser
560 565 570

Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp
575 580 585

Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu
590 595 600

Ala Thr Pro Ala Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala
605 610 615

Gly Arg Phe Ser Thr Gln Glu Glu Ile Gln
620 625

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 645 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys
1 5 10 15

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser
20 25 30

Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys Glu Met Lys Ser Gln
35 40 45

Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser
50 55 60

Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn
65 70 75

Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys
80 85 90

Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala
95 100 105

Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn
110 115 120

Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile
125 130 135

Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser
140 145 150

Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr
155 160 165

Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val
170 175 180

Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu
185 190 195

Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys
200 205 210

Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val
215 220 225

Met Ala Ser Phe Tyr Lys His Leu Gly Ile Glu Phe Met Glu Ala
230 235 240

Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys
245 250 255

Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys
260 265 270

Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu Arg Gln
275 280 285

Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile Ala Asn Gly
290 295 300

Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn
305 310 315

Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu
320 325 330

Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr
335 340 345

Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp
350 355 360

Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser His Ser Val
365 370 375

Ile Val Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Thr
380 385 390

Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu
395 400 405

Cys Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr
410 415 420

Arg Asp Ser Pro His Ser Glu Arg Tyr Val Ser Ala Met Thr Thr
425 430 435

Pro Ala Arg Met Ser Pro Val Asp Phe His Thr Pro Ser Ser Pro
440 445 450

Lys Ser Pro Pro Ser Glu Met Ser Pro Pro Val Ser Ser Met Thr
455 460 465

Val Ser Met Pro Ser Met Ala Val Ser Pro Phe Met Glu Glu Glu
470 475 480

Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg Glu Lys Lys
485 490 495

Phe Asp His His Pro Gln Gln Phe Ser Ser Phe His His Asn Pro
500 505 510

Ala His Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ile Val
515 520 525

Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro Ala Gln
530 535 540

Glu Pro Val Lys Lys Leu Ala Asn Ser Arg Arg Ala Lys Arg Thr
545 550 555

Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu Val Asp Ser Asn
560 565 570

Thr Ser Ser Gln Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu
575 580 585

Arg Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu
590 595 600

Ala Ala Ser Leu Glu Ala Thr Pro Ala Phe Arg Leu Ala Asp Ser
605 610 615

Arg Thr Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Ile Gln
620 625 630

Ala Arg Leu Ser Ser Val Ile Ala Asn Gln Asp Pro Ile Ala Val
635 640 645



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 637 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys 
1 5 10 15 

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser 
20 25 30 

Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys Glu Met Lys Ser Gln 
35 40 45 

Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser 
50 55 60 

Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn 
65 70 75 

Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys 
80 85 90 

Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala 
95 100 105 

Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn 
110 115 120 

Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile 
125 130 135 

Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser 
140 145 150 

Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr 
155 160 165 

Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val 
170 175 180 

Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu 
185 190 195 

Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys 
200 205 210 

Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val 
215 220 225 

Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val 
230 235 240 

Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile 
245 250 255 

Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys Lys 
260 265 270 

Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn Asn 
275 280 285 

Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro 
290 295 300 



EP1 114 863 A2 






Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile 
305 310 315 

Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser Phe Ser 
320 325 330 

Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr 
335 340 345 

Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Ile 
350 355 360 

Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu Asn 
365 370 375 

Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn 
380 385 390 

Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His Ala 
395 400 405 

Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg 
410 415 420 

Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp 
425 430 435 

Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser 
440 445 450 

Pro Pro Val Ser Ser Met Thr Val Ser Lys Pro Ser Met Ala Val 
455 460 465 

Ser Pro Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro 
470 475 480 

Pro Arg Leu Arg Glu Lys Lys Phe Asp His His Pro Gln Gln Phe 
485 490 495 

Ser Ser Phe His His Asn Pro Ala His Asp Ser Asn Ser Leu Pro 
500 505 510 

Ala Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr 
515 520 525 

Gln Glu Tyr Glu Pro Ala Gln Glu Pro Val Lys Lys Leu Ala Asn 
530 535 540 

Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn Gly His Ile Ala Asn 
545 550 555 

Arg Leu Glu Val Asp Ser Asn Thr Ser Ser Gln Ser Ser Asn Ser 
560 565 570 

Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr Pro Phe 
575 580 585 

Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu Ala Thr Pro 
590 595 600 

Ala Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala Gly Arg Phe 
605 610 615 

Ser Thr Gln Glu Glu Ile Gln Ala Arg Leu Ser Ser Val Ile Ala 
620 625 630 

Asn Gln Asp Pro Ile Ala Val 
635 637 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 420 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys 
1 5 10 15 

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser 
20 25 30 

Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys Glu Met Lys Ser Gln 
35 40 45 

Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser 
50 55 60 

Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn 
65 70 75 

Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys 
80 85 90 

Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala 
95 100 105 

Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn 
110 115 120 

Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile 
125 130 135 

Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser 
140 145 150 

Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr 
155 160 165 

Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val 
170 175 180 

Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu 
185 190 195 

Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys 
200 205 210 

Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val 
215 220 225 

Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val 
230 235 240 

Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile 
245 250 255 

Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys Lys 
260 265 270 

Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn Asn 
275 280 285 

Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro 
290 295 300 

Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile 
305 310 315 

Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser Phe Ser 
320 325 330 

Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr 
335 340 345 

Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Ile 
350 355 360 

Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu Asn 
365 370 375 

Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn 
380 385 390 

Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His Ala 
395 400 405 

Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg 
410 415 420 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 241 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys 
1 5 10 15 

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser 
20 25 30 

Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys Glu Met Lys Ser Gln 
35 40 45 

Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser 
50 55 60 

Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn 
65 70 75 

Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys 
80 85 90 

Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala 
95 100 105 

Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn 
110 115 120 

Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile 
125 130 135 

Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser 
140 145 150 

Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr 
155 160 165 

Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val 
170 175 180 

Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu 
185 190 195 

Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys 
200 205 210 

Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val 
215 220 225 

Met Ala Ser Phe Tyr Ser Thr Ser Thr Pro Phe Leu Ser Leu Pro 
230 235 240 

Glu 
241 



Claims 

1. An isolated antibody directed against the extra-cellular domain of p185HER2 which antagonizes heregulin (HRG). 

2. A composition comprising isolated heregulin polypeptide. 

3. The composition of claim 2 wherein the heregulin is antigenically active. 

4. The composition of claim 2 wherein the heregulin is biologically active. 

5. The composition of claim 4 wherein the heregulin is HRG-GFD. 

6. The composition of claim 2 wherein the heregulin is heregulin-α, -β1, -β2, or -β3. 

7. The composition of claim 4 wherein the heregulin is human heregulin-α-GFD. 

8. The composition of claim 4 wherein the heregulin is human heregulin-β1-GFD, heregulin-β2-GFD or heregulin-β3-GFD. 

9. The composition of claim 2 further comprising pharmaceutically acceptable carrier. 

10. The composition of claim 9 wherein the heregulin is a heregulin GFD. 
11. The composition of claim 10 further comprising an immune adjuvant. 

12. The composition of claim 11 wherein the heregulin GFD comprises an immunogenic, non-heregulin polypeptide. 

13. The composition of claim 2 wherein the heregulin is NTD-GFD. 

14. The composition of claim 2 wherein the heregulin is NTD-GFD-transmembrane polypeptide. 

15. The composition of claim 2 wherein the heregulin is HRG-GFD. 

16. The composition of claim 2 wherein the heregulin comprises a cytoplasmic domain. 

17. The composition of claim 2 wherein the heregulin is NTD-GFD and it has an amino acid sequence which is at least 85% homologous with the native heregulin-α, -β1, -β2, -β3 NTD-GFD sequence. 

18. The composition of claim 2 wherein the heregulin polypeptide comprises an enzyme. 

19. The composition of claim 17 wherein the heregulin is HRG-α. 

20. The composition of claim 19 wherein the heregulin-α has an amino acid substituted, deleted or inserted adjacent to any one of residues 1-23, 107-108, 121-123, 128-130 and 163-247 (Fig. 15). 

21. The composition of claim 17 wherein the heregulin is HRG-β1. 

22. The composition of claim 21 wherein the heregulin β1 has an amino acid substituted, deleted or inserted adjacent to residues 1-23, 107-108, 121-123, 128-130 and 163-252 (Fig. 15). 

23. The composition of claim 17 wherein the heregulin is HRG-β2. 

24. The composition of claim 23 wherein the heregulin β2 has an amino acid substituted, deleted or inserted adjacent to any one of residues 1-23, 107-108, 121-123, 128-130 and 163-244 (Fig. 15). 

25. The composition of claim 17 wherein the heregulin is HRG-β3. 

26. The composition of claim 25 wherein the heregulin β3 has an amino acid substituted, deleted or inserted adjacent to any one of residues 1-23, 107-108, 121-123, 128-130 and 163-241 (Fig. 15). 

27. An isolated antibody that is capable of binding a heregulin polypeptide. 

28. The isolated antibody of claim 27 that is capable of binding specifically to a heregulin-α, heregulin-β1, heregulin-β2, or heregulin-β3. 

29. Isolated heregulin encoding nucleic acid. 

30. The nucleic acid of claim 29 which encodes heregulin-α, heregulin-β1, heregulin-β2, or heregulin-β3 polypeptide. 

31. The nucleic acid of claim 29 that encodes a heregulin-GFD. 

32. An expression vector comprising the nucleic acid of claim 29. 

33. The expression vector of claim 32 wherein the nucleic acid encodes a heregulin-GFD. 

34. A host cell transformed with a vector of claim 32. 

35. A method comprising culturing the host cell of claim 34 to express the heregulin and recovering the heregulin from the host cell. 

36. The method of claim 35 wherein the heregulin is heregulin-α, heregulin-β1, heregulin-β2, or heregulin-β3. 

37. The method of claim 35 wherein the heregulin is heregulin-NTD-GFD. 

38. The method of claim 35 wherein the heregulin is heregulin-GFD. 

39. A method of determining the presence of a heregulin nucleic acid, comprising contacting the nucleic acid of claim 29 with a test sample nucleic acid and determining whether hybridization has occurred. 

40. A method of amplifying a nucleic acid test sample comprising priming a nucleic acid polymerase chain reaction with the nucleic acid of claim 29. 

41. A method for purifying a heregulin comprising adsorbing heregulin from a contaminated solution thereof onto heparin Sepharose or a cation exchange resin. 
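
The "at least 85% homologous" criterion recited in claim 17 can be illustrated numerically. The sketch below is not taken from the specification: it assumes that homology is scored as percent identity over two sequences that have already been aligned (gaps written as "-"), and that gapped positions are left out of the count. The function name `percent_identity`, the one-letter residue strings and the variant sequence are hypothetical and serve only as an illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned positions that carry identical residues.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' for gaps. Gapped columns are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# First ten residues of SEQ ID NO: 30 in one-letter code, compared with a
# hypothetical variant carrying a single substitution (E -> D at position 6).
native = "MSERKEGRGK"
variant = "MSERKDGRGK"
meets_threshold = percent_identity(native, variant) >= 85.0
```

Other scoring conventions (for example, counting gapped columns as mismatches) would give different percentages, so the convention has to be fixed before a threshold such as 85% is applied.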

FIG. 3 

GG GCG CGA GCG CCT CAG CGC GGC CGC TCG CTC TCC CCC 38 
Ala Arg Ala Pro Gln Arg Gly Arg Ser Leu Ser Pro 
1 5 10 

TCG AGG GAC AAA CTT TTC CCA AAC CCG ATC CGA GCC CTT 77 
Ser Arg Asp Lys Leu Phe Pro Asn Pro Ile Arg Ala Leu 
15 20 25 

GGA CCA AAC TCG CCT GCG CCG AGA GCC GTC CGC GTA GAG 116 
Gly Pro Asn Ser Pro Ala Pro Arg Ala Val Arg Val Glu 
30 35 

CGC TCC GTC TCC GGC GAG ATG TCC GAG CGC AAA GAA GGC 155 
Arg Ser Val Ser Gly Glu Met Ser Glu Arg Lys Glu Gly 
40 45 50 

AGA GGC AAA GGG AAG GGC AAG AAG AAG GAG CGA GGC TCC 194 

Arg Gly Lys Gly Lys Gly Lys Lys Lys Glu Arg Gly Ser 
55 60 

GGC AAG AAG CCG GAG TCC GCG GCG GGC AGC CAG AGC CCA 233 
Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser Gln Ser Pro 
65 70 75 

GCC TTG CCT CCC CGA TTG AAA GAG ATG AAA AGC CAG GAA 272 
Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu 
80 85 90 

TCG GCT GCA GGT TCC AAA CTA GTC CTT CGG TGT GAA ACC 311 
Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr 
95 100 

AGT TCT GAA TAC TCC TCT CTC AGA TTC AAG TGG TTC AAG 350 
Ser Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys 
105 110 115 

AAT GGG AAT GAA TTG AAT CGA AAA AAC AAA CCA CAA AAT 389 
Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn 
120 125 

ATC AAG ATA CAA AAA AAG CCA GGG AAG TCA GAA CTT CGC 428 
Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg 
130 135 140 

ATT AAC AAA GCA TCA CTG GCT GAT TCT GGA GAG TAT ATG 467 
Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met 
145 150 155 

TGC AAA GTG ATC AGC AAA TTA GGA AAT GAC AGT GCC TCT 506 
Cys Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser 

FIG. 4A 

GCC AAT ATC ACC ATC GTG GAA TCA AAC GAG ATC ATC ACT 545 
Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile Ile Thr 
170 175 180 

GGT ATG CCA GCC TCA ACT GAA GGA GCA TAT GTG TCT TCA 584 
Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser 
185 190 

GAG TCT CCC ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA 623 
Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala 
195 200 205 

AAT ACT TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA 662 
Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr 
210 215 220 

AGC CAT CTT GTA AAA TGT GCG GAG AAG GAG AAA ACT TTC 701 
Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe 
225 230 

TGT GTG AAT GGA GGG GAG TGC TTC ATG GTG AAA GAC CTT 740 
Cys Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu 
235 240 245 

TCA AAC CCC TCG AGA TAC TTG TGC AAG TGC CAA CCT GGA 779 
Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Gln Pro Gly 
250 255 

TTC ACT GGA GCA AGA TGT ACT GAG AAT GTG CCC ATG AAA 818 
Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys 
260 265 270 

GTC CAA AAC CAA GAA AAG GCG GAG GAG CTG TAC CAG AAG 857 
Val Gln Asn Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys 
275 280 285 

AGA GTG CTG ACC ATA ACC GGC ATC TGC ATC GCC CTC CTT 896 
Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu 
290 295 

GTG GTC GGC ATC ATG TGT GTG GTG GCC TAC TGC AAA ACC 935 
Val Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr 
300 305 310 

AAG AAA CAG CGG AAA AAG CTG CAT GAC CGT CTT CGG CAG 974 
Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu Arg Gln 
315 320 

AGC CTT CGG TCT GAA CGA AAC AAT ATG ATG AAC ATT GCC 1013 
Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile Ala 
325 330 335 

FIG. 4B 

AAT GGG CCT CAC CAT CCT AAC CCA CCC CCC GAG AAT GTC 1052 
Asn Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val 
340 345 350 

CAG CTG GTG AAT CAA TAC GTA TCT AAA AAC GTC ATC TCC 1091 
Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser 
355 360 

AGT GAG CAT ATT GTT GAG AGA GAA GCA GAG ACA TCC TTT 1130 
Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser Phe 
365 370 375 

TCC ACC AGT CAC TAT ACT TCC ACA GCC CAT CAC TCC ACT 1169 
Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr 
380 385 

ACT GTC ACC CAG ACT CCT AGC CAC AGC TGG AGC AAC GGA 1208 
Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly 
390 395 400 

CAC ACT GAA AGC ATC CTT TCC GAA AGC CAC TCT GTA ATC 1247 
His Thr Glu Ser Ile Leu Ser Glu Ser His Ser Val Ile 
405 410 415 

GTG ATG TCA TCC GTA GAA AAC AGT AGG CAC AGC AGC CCA 1286 
Val Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro 
420 425 

ACT GGG GGC CCA AGA GGA CGT CTT AAT GGC ACA GGA GGC 1325 
Thr Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr Gly Gly 
430 435 440 

CCT CGT GAA TGT AAC AGC TTC CTC AGG CAT GCC AGA GAA 1364 
Pro Arg Glu Cys Asn Ser Phe Leu Arg His Ala Arg Glu 
445 450 

ACC CCT GAT TCC TAC CGA GAC TCT CCT CAT AGT GAA AGG 1403 
Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg 
455 460 465 

TAT GTG TCA GCC ATG ACC ACC CCG GCT CGT ATG TCA CCT 1442 
Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro 
470 475 480 

GTA GAT TTC CAC ACG CCA AGC TCC CCC AAA TCG CCC CCT 1481 
Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro 
485 490 

TCG GAA ATG TCT CCA CCC GTG TCC AGC ATG ACG GTG TCC 1520 
Ser Glu Met Ser Pro Pro Val Ser Ser Met Thr Val Ser 
495 500 505 

FIG. 4C 

ATG CCT TCC ATG GCG GTC AGC CCC TTC ATG GAA GAA GAG 1559 
Met Pro Ser Met Ala Val Ser Pro Phe Met Glu Glu Glu 
510 515 

AGA CCT CTA CTT CTC GTG ACA CCA CCA AGG CTG CGG GAG 1598 
Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg Glu 
520 525 530 

AAG AAG TTT GAC CAT CAC CCT CAG CAG TTC AGC TCC TTC 1637 
Lys Lys Phe Asp His His Pro Gln Gln Phe Ser Ser Phe 
535 540 545 

CAC CAC AAC CCC GCG CAT GAC AGT AAC AGC CTC CCT GCT 1676 
His His Asn Pro Ala His Asp Ser Asn Ser Leu Pro Ala 
550 555 

AGC CCC TTG AGG ATA GTG GAG GAT GAG GAG TAT GAA ACG 1715 
Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr 
560 565 570 

ACC CAA GAG TAC GAG CCA GCC CAA GAG CCT GTT AAG AAA 1754 
Thr Gln Glu Tyr Glu Pro Ala Gln Glu Pro Val Lys Lys 
575 580 

CTC GCC AAT AGC CGG CGG GCC AAA AGA ACC AAG CCC AAT 1793 
Leu Ala Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn 
585 590 595 

GGC CAC ATT GCT AAC AGA TTG GAA GTG GAC AGC AAC ACA 1832 
Gly His Ile Ala Asn Arg Leu Glu Val Asp Ser Asn Thr 
600 605 610 

AGC TCC CAG AGC AGT AAC TCA GAG AGT GAA ACA GAA GAT 1871 
Ser Ser Gln Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp 
615 620 

GAA AGA GTA GGT GAA GAT ACG CCT TTC CTG GGC ATA CAG 1910 
Glu Arg Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln 
625 630 635 

AAC CCC CTG GCA GCC AGT CTT GAG GCA ACA CCT GCC TTC 1949 
Asn Pro Leu Ala Ala Ser Leu Glu Ala Thr Pro Ala Phe 
640 645 

CGC CTG GCT GAC AGC AGG ACT AAC CCA GCA GGC CGC TTC 1988 
Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala Gly Arg Phe 
650 655 660 

TCG ACA CAG GAA GAA ATC CAG G 2010 
Ser Thr Gln Glu Glu Ile Gln 
665 669 

FIG. 4D 

CELL GROWTH STIMULATION BY HEREGULIN 2-alpha 

CONTROL SKBR-3 MCF-7 MB-468 

FIG. 7 

GG GAC AAA CTT TTC CCA AAC CCG ATC CGA GCC CTT GGA 38 
Asp Lys Leu Phe Pro Asn Pro Ile Arg Ala Leu Gly 
1 5 10 

CCA AAC TCG CCT GCG CCG AGA GCC GTC CGC GTA GAG CGC 77 
Pro Asn Ser Pro Ala Pro Arg Ala Val Arg Val Glu Arg 
15 20 25 

TCC GTC TCC GGC GAG ATG TCC GAG CGC AAA GAA GGC AGA 116 
Ser Val Ser Gly Glu Met Ser Glu Arg Lys Glu Gly Arg 
30 35 

GGC AAA GGG AAG GGC AAG AAG AAG GAG CGA GGC TCC GGC 155 
Gly Lys Gly Lys Gly Lys Lys Lys Glu Arg Gly Ser Gly 
40 45 50 

AAG AAG CCG GAG TCC GCG GCG GGC AGC CAG AGC CCA GCC 194 
Lys Lys Pro Glu Ser Ala Ala Gly Ser Gln Ser Pro Ala 
55 60 

TTG CCT CCC CAA TTG AAA GAG ATG AAA AGC CAG GAA TCG 233 
Leu Pro Pro Gln Leu Lys Glu Met Lys Ser Gln Glu Ser 
65 70 75 

GCT GCA GGT TCC AAA CTA GTC CTT CGG TGT GAA ACC AGT 272 
Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser 
80 85 90 

TCT GAA TAC TCC TCT CTC AGA TTC AAG TGG TTC AAG AAT 311 
Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn 
95 100 

GGG AAT GAA TTG AAT CGA AAA AAC AAA CCA CAA AAT ATC 350 
Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile 
105 110 115 

AAG ATA CAA AAA AAG CCA GGG AAG TCA GAA CTT CGC ATT 389 
Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile 
120 125 

AAC AAA GCA TCA CTG GCT GAT TCT GGA GAG TAT ATG TGC 428 
Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys 
130 135 140 

AAA GTG ATC AGC AAA TTA GGA AAT GAC AGT GCC TCT GCC 467 
Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala 
145 150 155 

AAT ATC ACC ATC GTG GAA TCA AAC GAG ATC ATC ACT GGT 506 
Asn Ile Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly 
160 165 



FIG. 8A 

ATG CCA GCC TCA ACT GAA GGA GCA TAT GTG TCT TCA GAG 545 
Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu 
170 175 180 

TCT CCC ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA AAT 584 
Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn 
185 190 

ACT TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC 623 
Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser 
195 200 205 

CAT CTT GTA AAA TGT GCG GAG AAG GAG AAA ACT TTC TGT 662 
His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys 
210 215 220 

GTG AAT GGA GGG GAG TGC TTC ATG GTG AAA GAC CTT TCA 701 
Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser 
225 230 

AAC CCC TCG AGA TAC TTG TGC AAG TGC CCA AAT GAG TTT 740 
Asn Pro Ser Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe 
235 240 245 

ACT GGT GAT CGC TGC CAA AAC TAC GTA ATG GCC AGC TTC 779 
Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser Phe 
250 255 

TAC AAG CAT CTT GGG ATT GAA TTT ATG GAG GCG GAG GAG 818 
Tyr Lys His Leu Gly Ile Glu Phe Met Glu Ala Glu Glu 
260 265 270 

CTG TAC CAG AAG AGA GTG CTG ACC ATA ACC GGC ATC TGC 857 
Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys 
275 280 285 

ATC GCC CTC CTT GTG GTC GGC ATC ATG TGT GTG GTG GCC 896 
Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala 
290 295 

TAC TGC AAA ACC AAG AAA CAG CGG AAA AAG CTG CAT GAC 935 
Tyr Cys Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp 
300 305 310 

CGT CTT CGG CAG AGC CTT CGG TCT GAA CGA AAC AAT ATG 974 
Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn Asn Met 
315 320 

ATG AAC ATT GCC AAT GGG CCT CAC CAT CCT AAC CCA CCC 1013 
Met Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro 
325 330 335 

FIG. 8B 

CCC GAG AAT GTC CAG CTG GTG AAT CAA TAC GTA TCT AAA 1052 
Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys 
340 345 350 

AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA GCA 1091 
Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala 
355 360 

GAG ACA TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA GCC 1130 
Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala 
365 370 375 

CAT CAC TCC ACT ACT GTC ACC CAG ACT CCT AGC CAC AGC 1169 
His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser 
380 385 

TGG AGC AAC GGA CAC ACT GAA AGC ATC CTT TCC GAA AGC 1208 
Trp Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser 
390 395 400 

CAC TCT GTA ATC GTG ATG TCA TCC GTA GAA AAC AGT AGG 1247 
His Ser Val Ile Val Met Ser Ser Val Glu Asn Ser Arg 
405 410 415 

CAC AGC AGC CCA ACT GGG GGC CCA AGA GGA CGT CTT AAT 1286 
His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn 
420 425 

GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC TTC CTC AGG 1325 
Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg 
430 435 440 

CAT GCC AGA GAA ACC CCT GAT TCC TAC CGA GAC TCT CCT 1364 
His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro 
445 450 

CAT AGT GAA AGG TAT GTG TCA GCC ATG ACC ACC CCG GCT 1403 
His Ser Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala 
455 460 465 

CGT ATG TCA CCT GTA GAT TTC CAC ACG CCA AGC TCC CCC 1442 
Arg Met Ser Pro Val Asp Phe His Thr Pro Ser Ser Pro 
470 475 480 

AAA TCG CCC CCT TCG GAA ATG TCT CCA CCC GTG TCC AGC 1481 
Lys Ser Pro Pro Ser Glu Met Ser Pro Pro Val Ser Ser 
485 490 

ATG ACG GTG TCC ATG CCT TCC ATG GCG GTC AGC CCC TTC 1520 
Met Thr Val Ser Met Pro Ser Met Ala Val Ser Pro Phe 
495 500 505 

FIG. 8C 

ATG GAA GAA GAG AGA CCT CTA CTT CTC GTG ACA CCA CCA 1559 
Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro 
510 515 

AGG CTG CGG GAG AAG AAG TTT GAC CAT CAC CCT CAG CAG 1598 
Arg Leu Arg Glu Lys Lys Phe Asp His His Pro Gln Gln 
520 525 530 

TTC AGC TCC TTC CAC CAC AAC CCC GCG CAT GAC AGT AAC 1637 
Phe Ser Ser Phe His His Asn Pro Ala His Asp Ser Asn 
535 540 545 

AGC CTC CCT GCT AGC CCC TTG AGG ATA GTG GAG GAT GAG 1676 
Ser Leu Pro Ala Ser Pro Leu Arg Ile Val Glu Asp Glu 
550 555 

GAG TAT GAA ACG ACC CAA GAG TAC GAG CCA GCC CAA GAG 1715 
Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro Ala Gln Glu 
560 565 570 

CCT GTT AAG AAA CTC GCC AAT AGC CGG CGG GCC AAA AGA 1754 
Pro Val Lys Lys Leu Ala Asn Ser Arg Arg Ala Lys Arg 
575 580 

ACC AAG CCC AAT GGC CAC ATT GCT AAC AGA TTG GAA GTG 1793 
Thr Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu Val 
585 590 595 

GAC AGC AAC ACA AGC TCC CAG AGC AGT AAC TCA GAG AGT 1832 
Asp Ser Asn Thr Ser Ser Gln Ser Ser Asn Ser Glu Ser 
600 605 610 

GAA ACA GAA GAT GAA AGA GTA GGT GAA GAT ACG CCT TTC 1871 
Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr Pro Phe 
615 620 

CTG GGC ATA CAG AAC CCC CTG GCA GCC AGT CTT GAG GCA 1910 
Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu Ala 
625 630 635 

ACA CCT GCC TTC CGC CTG GCT GAC AGC AGG ACT AAC CCA 1949 
Thr Pro Ala Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro 
640 645 

GCA GGC CGC TTC TCG ACA CAG GAA GAA ATC CAG GCC AGG 1988 
Ala Gly Arg Phe Ser Thr Gln Glu Glu Ile Gln Ala Arg 
650 655 660 

CTG TCT AGT GTA ATT GCT AAC CAA GAC CCT ATT GCT GTA TA 2029 
Leu Ser Ser Val Ile Ala Asn Gln Asp Pro Ile Ala Val 
665 670 675 



FIG. 8D 

AA AGA GCC GGC GAG GAG TTC CCC GAA ACT TGT TGG AAC 38 
Arg Ala Gly Glu Glu Phe Pro Glu Thr Cys Trp Asn 
1 5 10 

TCC GGG CTC GCG CGG AGG CCA GGA GCT GAG CGG CGG CGG 77 
Ser Gly Leu Ala Arg Arg Pro Gly Ala Glu Arg Arg Arg 
15 20 25 

CTG CCG GAC GAT GGG AGC GTG AGC AGG ACG GTG ATA ACC 116 
Leu Pro Asp Asp Gly Ser Val Ser Arg Thr Val Ile Thr 
30 35 

TCT CCC CGA TCG GGT TGC GAG GGC GCC GGG CAG AGG CCA 155 
Ser Pro Arg Ser Gly Cys Glu Gly Ala Gly Gln Arg Pro 
40 45 50 

GGA CGC GAG CCG CCA GCG GTG GGA CCC ATC GAC GAC TTC 194 
Gly Arg Glu Pro Pro Ala Val Gly Pro Ile Asp Asp Phe 
55 60 

CCG GGG CGA CAG GAG CAG CCC CGA GAG CCA GGG CGA GCG 233 
Pro Gly Arg Gln Glu Gln Pro Arg Glu Pro Gly Arg Ala 
65 70 75 

CCC GTT CCA GGT GGC CGG ACC GCC CGC CGC GTC CGC GCC 272 
Pro Val Pro Gly Gly Arg Thr Ala Arg Arg Val Arg Ala 
80 85 90 

GCG CTC CCT GCA GGC AAC GGG AGA CGC CCC CGC GCA GCG 311 
Ala Leu Pro Ala Gly Asn Gly Arg Arg Pro Arg Ala Ala 
95 100 

CGA GCG CCT CAG CGC GGC CGC TCG CTC TCC CCC TCG AGG 350 
Arg Ala Pro Gln Arg Gly Arg Ser Leu Ser Pro Ser Arg 
105 110 115 

GAC AAA CTT TTC CCA AAC CCG ATC CGA GCC CTT GGA CCA 389 
Asp Lys Leu Phe Pro Asn Pro Ile Arg Ala Leu Gly Pro 
120 125 

AAC TCG CCT GCG CCG AGA GCC GTC CGC GTA GAG CGC TCC 428 
Asn Ser Pro Ala Pro Arg Ala Val Arg Val Glu Arg Ser 
130 135 140 

GTC TCC GGC GAG ATG TCC GAG CGC AAA GAA GGC AGA GGC 467 
Val Ser Gly Glu Met Ser Glu Arg Lys Glu Gly Arg Gly 
145 150 155 

AAA GGG AAG GGC AAG AAG AAG GAG CGA GG 496 
Lys Gly Lys Gly Lys Lys Lys Glu Arg 
160 164 



FIG. 11 

GTGGCTGCGG GGCAATTGAA AAAGAGCCGG CGAGGAGTTC CCCGAAACTT 50 
GTTGGAACTC CGGGCTCGCG CGGAGGCCAG GAGCTGAGCG GCGGCGGCTG 100 
CCGGACGATG GGAGCGTGAG CAGGACGGTG ATAACCTCTC CCCGATCGGG 150 
TTGCGAGGGC GCCGGGCAGA GGCCAGGACG CGAGCCGCCA GCGGCGGGAC 200 
CCATCGACGA CTTCCCGGGG CGACAGGAGC AGCCCCGAGA GCCAGGGCGA 250 
GCGCCCGTTC CAGGTGGCCG GACCGCCCGC CGCGTCCGCG CCGCGCTCCC 300 
TGCAGGCAAC GGGAGACGCC CCCGCGCAGC GCGAGCGCCT CAGCGCGGCC 350 
GCTCGCTCTC CCCATCGAGG GACAAACTTT TCCCAAACCC GATCCGAGCC 400 
CTTGGACCAA ACTCGCCTGC GCCGAGAGCC GTCCGCGTAG AGCGCTCCGT 450 



CTCCGGCGAG ATG TCC GAG CGC AAA GAA GGC AGA GGC AAA 490 
Met Ser Glu Arg Lys Glu Gly Arg Gly Lys 
1 5 10 

GGG AAG GGC AAG AAG AAG GAG CGA GGC TCC GGC AAG AAG 529 
Gly Lys Gly Lys Lys Lys Glu Arg Gly Ser Gly Lys Lys 
15 20 

CCG GAG TCC GCG GCG GGC AGC CAG AGC CCA GCC TTG CCT 568 
Pro Glu Ser Ala Ala Gly Ser Gln Ser Pro Ala Leu Pro 
25 30 35 

CCC CAA TTG AAA GAG ATG AAA AGC CAG GAA TCG GCT GCA 607 
Pro Gln Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala 
40 45 

GGT TCC AAA CTA GTC CTT CGG TGT GAA ACC AGT TCT GAA 646 
Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu 
50 55 60 

TAC TCC TCT CTC AGA TTC AAG TGG TTC AAG AAT GGG AAT 685 
Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn 
65 70 75 

GAA TTG AAT CGA AAA AAC AAA CCA CAA AAT ATC AAG ATA 724 
Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile 

FIG. 12A 

CAA AAA AAG CCA GGG AAG TCA GAA CTT CGC ATT AAC AAA 763 
Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys 
90 95 100 

GCA TCA CTG GCT GAT TCT GGA GAG TAT ATG TGC AAA GTG 802 
Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val 
105 110 

ATC AGC AAA TTA GGA AAT GAC AGT GCC TCT GCC AAT ATC 841 
Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile 
115 120 125 

ACC ATC GTG GAA TCA AAC GAG ATC ATC ACT GGT ATG CCA 880 
Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly Met Pro 
130 135 140 

GCC TCA ACT GAA GGA GCA TAT GTG TCT TCA GAG TCT CCC 919 
Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu Ser Pro 
145 150 

ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA AAT ACT TCT 958 
Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr Ser 
155 160 165 

TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC CAT CTT 997 
Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu 
170 175 

GTA AAA TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT 1036 
Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn 
180 185 190 

GGA GGG GAG TGC TTC ATG GTG AAA GAC CTT TCA AAC CCC 1075 
Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro 
195 200 205 

TCG AGA TAC TTG TGC AAG TGC CCA AAT GAG TTT ACT GGT 1114 
Ser Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly 
210 215 

GAT CGC TGC CAA AAC TAC GTA ATG GCC AGC TTC TAC AAG 1153 
Asp Arg Cys Gln Asn Tyr Val Met Ala Ser Phe Tyr Lys 
220 225 230 

GCG GAG GAG CTG TAC CAG AAG AGA GTG CTG ACC ATA ACC 1192 
Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr 
235 240 

GGC ATC TGC ATC GCC CTC CTT GTG GTC GGC ATC ATG TGT 1231 
Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys 
245 250 255 

GTG GTG GCC TAC TGC AAA ACC AAG AAA CAG CGG AAA AAG 1270 
Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys Lys 

FIG. 12B 

CTG CAT GAC CGT CTT CGG CAG AGC CTT CGG TCT GAA CGA 1309 
Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg 
275 280 

AAC AAT ATG ATG AAC ATT GCC AAT GGG CCT CAC CAT CCT 1348 
Asn Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro 
285 290 295 

AAC CCA CCC CCC GAG AAT GTC CAG CTG GTG AAT CAA TAC 1387 
Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr 
300 305 

GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG 1426 
Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu 
310 315 320 

AGA GAA GCA GAG ACA TCC TTT TCC ACC AGT CAC TAT ACT 1465 
Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr Thr 
325 330 335 

TCC ACA GCC CAT CAC TCC ACT ACT GTC ACC CAG ACT CCT 1504 
Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro 
340 345 

AGC CAC AGC TGG AGC AAC GGA CAC ACT GAA AGC ATC CTT 1543 
Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Ile Leu 
350 355 360 

TCC GAA AGC CAC TCT GTA ATC GTG ATG TCA TCC GTA GAA 1582 
Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu 
365 370 

AAC AGT AGG CAC AGC AGC CCA ACT GGG GGC CCA AGA GGA 1621 
Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly 
375 380 385 

CGT CTT AAT GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC 1660 
Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser 
390 395 400 

TTC CTC AGG CAT GCC AGA GAA ACC CCT GAT TCC TAC CGA 1699 
Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg 
405 410 

GAC TCT CCT CAT AGT GAA AGG TAT GTG TCA GCC ATG ACC 1738 
Asp Ser Pro His Ser Glu Arg Tyr Val Ser Ala Met Thr 
415 420 425 

ACC CCG GCT CGT ATG TCA CCT GTA GAT TTC CAC ACG CCA 1777 
Thr Pro Ala Arg Met Ser Pro Val Asp Phe His Thr Pro 
430 435 

AGC TCC CCC AAA TCG CCC CCT TCG GAA ATG TCT CCA CCC 1816 
Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro Pro 
440 445 450 

FIG. 12C 
GTG TCC AGC ATG ACG GTG TCC AAG CCT TCC ATG GCG GTC 1855 
Val Ser Ser Met Thr Val Ser Lys Pro Ser Met Ala Val 
455 460 465 

AGC CCC TTC ATG GAA GAA GAG AGA CCT CTA CTT CTC GTG 1894 
Ser Pro Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val 
470 475 

ACA CCA CCA AGG CTG CGG GAG AAG AAG TTT GAC CAT CAC 1933 
Thr Pro Pro Arg Leu Arg Glu Lys Lys Phe Asp His His 
480 485 490 

CCT CAG CAG TTC AGC TCC TTC CAC CAC AAC CCC GCG CAT 1972 
Pro Gln Gln Phe Ser Ser Phe His His Asn Pro Ala His 
495 500 

GAC AGT AAC AGC CTC CCT GCT AGC CCC TTG AGG ATA GTG 2011 
Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ile Val 
505 510 515 

GAG GAT GAG GAG TAT GAA ACG ACC CAA GAG TAC GAG CCA 2050 
Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro 
520 525 530 

GCC CAA GAG CCT GTT AAG AAA CTC GCC AAT AGC CGG CGG 2089 
Ala Gln Glu Pro Val Lys Lys Leu Ala Asn Ser Arg Arg 
535 540 

GCC AAA AGA ACC AAG CCC AAT GGC CAC ATT GCT AAC AGA 2128 
Ala Lys Arg Thr Lys Pro Asn Gly His Ile Ala Asn Arg 
545 550 555 

TTG GAA GTG GAC AGC AAC ACA AGC TCC CAG AGC AGT AAC 2167 
Leu Glu Val Asp Ser Asn Thr Ser Ser Gln Ser Ser Asn 
560 565 

TCA GAG AGT GAA ACA GAA GAT GAA AGA GTA GGT GAA GAT 2206 
Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp 
570 575 580 

ACG CCT TTC CTG GGC ATA CAG AAC CCC CTG GCA GCC AGT 2245 
Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser 
585 590 595 

CTT GAG GCA ACA CCT GCC TTC CGC CTG GCT GAC AGC AGG 2284 
Leu Glu Ala Thr Pro Ala Phe Arg Leu Ala Asp Ser Arg 
600 605 

ACT AAC CCA GCA GGC CGC TTC TCG ACA CAG GAA GAA ATC 2323 
Thr Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Ile 
610 615 620 

CAG GCC AGG CTG TCT AGT GTA ATT GCT AAC CAA GAC CCT 2362 
Gln Ala Arg Leu Ser Ser Val Ile Ala Asn Gln Asp Pro 

FIG. 12D 
ATT GCT GTA TAAAACCTA AATAAACACA TAGATTCACC TGTAAAACTT 2410 
Ile Ala Val 
635 637 

TATTTTATAT AATAAAGTAT TCCACCTTAA ATTAAACAAT TTATTTTATT 2460 



TTAGCAGTTC TGCAAATAAA AAAAAAAAAA 2490 



FIG. 12E 
GCGCCTGCCT CCAACCTGCG GGCGGGAGGT GGGTGGCTGC GGGGCAATTG 50 
AAAAAGAGCC GGCGAGGAGT TCCCCGAAAC TTGTTGGAAC TCCGGGCTCG 100 
CGCGGAGGCC AGGAGCTGAG CGGCGGCGGC TGCCGGACGA TGGGAGCGTG 150 
AGCAGGACGG TGATAACCTC TCCCCGATCG GGTTGCGAGG GCGCCGGGCA 200 
GAGGCCAGGA CGCGAGCCGC CAGCGGCGGG ACCCATCGAC GACTTCCCGG 250 
GGCGACAGGA GCAGCCCCGA GAGCCAGGGC GAGCGCCCGT TCCAGGTGGC 300 
CGGACCGCCC GCCGCGTCCG CGCCGCGCTC CCTGCAGGCA ACGGGAGACG 350 
CCCCCGCGCA GCGCGAGCGC CTCAGCGCGG CCGCTCGCTC TCCCCATCGA 400 
GGGACAAACT TTTCCCAAAC CCGATCCGAG CCCTTGGACC AAACTCGCCT 450 



GCGCCGAGAG CCGTCCGCGT AGAGCGCTCC GTCTCCGGCG AG ATG 495 

Met 
1 

TCC GAG CGC AAA GAA GGC AGA GGC AAA GGG AAG GGC AAG 534 
Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys 
5 10 

AAG AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG TCC GCG 573 
Lys Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala 
15 20 25 

GCG GGC AGC CAG AGC CCA GCC TTG CCT CCC CAA TTG AAA 612 
Ala Gly Ser Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys 
30 35 40 

GAG ATG AAA AGC CAG GAA TCG GCT GCA GGT TCC AAA CTA 651 
Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu 
45 50 

GTC CTT CGG TGT GAA ACC AGT TCT GAA TAC TCC TCT CTC 690 
Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu 
55 60 65 

AGA TTC AAG TGG TTC AAG AAT GGG AAT GAA TTG AAT CGA 729 
Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg 
70 75 

FIG. 13A 
AAA AAC AAA CCA CAA AAT ATC AAG ATA CAA AAA AAG CCA 768 
Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys Lys Pro 
80 85 90 

GGG AAG TCA GAA CTT CGC ATT AAC AAA GCA TCA CTG GCT 807 
Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala 
95 100 105 

GAT TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC AAA TTA 846 
Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu 
110 115 

GGA AAT GAC AGT GCC TCT GCC AAT ATC ACC ATC GTG GAA 885 
Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu 
120 125 130 

TCA AAC GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA 924 
Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu 
135 140 

GGA GCA TAT GTG TCT TCA GAG TCT CCC ATT AGA ATA TCA 963 
Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser 
145 150 155 

GTA TCC ACA GAA GGA GCA AAT ACT TCT TCA TCT ACA TCT 1002 
Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser 
160 165 170 

ACA TCC ACC ACT GGG ACA AGC CAT CTT GTA AAA TGT GCG 1041 
Thr Ser Thr Thr Gly Thr Ser His Leu Val Lys Cys Ala 
175 180 

GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG GAG TGC 1080 
Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys 
185 190 195 

TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG 1119 
Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu 
200 205 

TGC AAG TGC CCA AAT GAG TTT ACT GGT GAT CGC TGC CAA 1158 
Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln 
210 215 220 

AAC TAC GTA ATG GCC AGC TTC TAC AGT ACG TCC ACT CCC 1197 
Asn Tyr Val Met Ala Ser Phe Tyr Ser Thr Ser Thr Pro 
225 230 235 

TTT CTG TCT CTG CCT GAA TAGGA GCATGCTCAG TTGGTGCTGC 1240 
Phe Leu Ser Leu Pro Glu 
240 241 

TTTCTTGTTG CTGCATCTCC CCTCAGATTC CACCTAGAGC TAGATGTGTC 1290 

FIG. 13B 
TTACCAGATC TAATATTGAC TGCCTCTGCC TGTCGCATGA GAACATTAAC 1340 
AAAAGCAATT GTATTACTTC CTCTGTTCGC GACTAGTTGG CTCTGAGATA 1390 
CTAATAGGTG TGTGAGGCTC CGGATGTTTC TGGAATTGAT ATTGAATGAT 1440 
GTGATACAAA TTGATAGTCA ATATCAAGCA GTGAAATATG ATAATAAAGG 1490 
CATTTCAAAG TCTCACTTTT ATTGATAAAA TAAAAATCAT TCTACTGAAC 1540 
AGTCCATCTT CTTTATACAA TGACCACATC CTGAAAAGGG TGTTGCTAAG 1590 
CTGTAACCGA TATGCACTTG AAATGATGGT AAGTTAATTT TGATTCAGAA 1640 
TGTGTTATTT GTCACAAATA AACATAATAA AAGGAGTTCA GATGTTTTTC 1690 
TTCATTAACC AAAAAAAAAA AAAAA 1715 

FIG. 13C 
GAGGCGCCTG CCTCCAACCT GCGGGCGGGA GGTGGGTGGC TGCGGGGCAA 50 
TTGAAAAAGA GCCGGCGAGG AGTTCCCCGA AACTTGTTGG AACTCCGGGC 100 
TCGCGCGGAG GCCAGGAGCT GAGCGGCGGC GGCTGCCGGA CGATGGGAGC 150 
GTGAGCAGGA CGGTGATAAC CTCTCCCCGA TCGGGTTGCG AGGGCGCCGG 200 
GCAGAGGCCA GGACGCGAGC CGCCAGCGGC GGGACCCATC GACGACTTCC 250 
CGGGGCGACA GGAGCAGCCC CGAGAGCCAG GGCGAGCGCC CGTTCCAGGT 300 
GGCCGGACCG CCCGCCGCGT CCGCGCCGCG CTCCCTGCAG GCAACGGGAG 350 
ACGCCCCCGC GCAGCGCGAG CGCCTCAGCG CGGCCGCTCG CTCTCCCCAT 400 
CGAGGGACAA ACTTTTCCCA AACCCGATCC GAGCCCTTGG ACCAAACTCG 450 



CCTGCGCCGA GAGCCGTCCG CGTAGAGCGC TCCGTCTCCG GCGAG AT 497 

Met 
1 

G TCC GAG CGC AAA GAA GGC AGA GGC AAA GGG AAG GGC AAG 537 
Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys 
5 10 

AAG AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG TCC GCG 576 
Lys Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala 
15 20 25 

GCG GGC AGC CAG AGC CCA GCC TTG CCT CCC CAA TTG AAA 615 
Ala Gly Ser Gln Ser Pro Ala Leu Pro Pro Gln Leu Lys 
30 35 40 

GAG ATG AAA AGC CAG GAA TCG GCT GCA GGT TCC AAA CTA 654 
Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu 
45 50 

GTC CTT CGG TGT GAA ACC AGT TCT GAA TAC TCC TCT CTC 693 
Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu 
55 60 65 

AGA TTC AAG TGG TTC AAG AAT GGG AAT GAA TTG AAT CGA 732 
Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg 
70 75 

FIG. 14A 
AAA AAC AAA CCA CAA AAT ATC AAG ATA CAA AAA AAG CCA 771 
Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys Lys Pro 
80 85 90 

GGG AAG TCA GAA CTT CGC ATT AAC AAA GCA TCA CTG GCT 810 
Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala 
95 100 105 

GAT TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC AAA TTA 849 
Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu 
110 115 

GGA AAT GAC AGT GCC TCT GCC AAT ATC ACC ATC GTG GAA 888 
Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu 
120 125 130 

TCA AAC GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA 927 
Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu 
135 140 

GGA GCA TAT GTG TCT TCA GAG TCT CCC ATT AGA ATA TCA 966 
Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser 
145 150 155 

GTA TCC ACA GAA GGA GCA AAT ACT TCT TCA TCT ACA TCT 1005 
Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser 
160 165 170 

ACA TCC ACC ACT GGG ACA AGC CAT CTT GTA AAA TGT GCG 1044 
Thr Ser Thr Thr Gly Thr Ser His Leu Val Lys Cys Ala 
175 180 

GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG GAG TGC 1083 
Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys 
185 190 195 

TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG 1122 
Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu 
200 205 

TGC AAG TGC CCA AAT GAG TTT ACT GGT GAT CGC TGC CAA 1161 
Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln 
210 215 220 

AAC TAC GTA ATG GCC AGC TTC TAC AAG GCG GAG GAG CTG 1200 
Asn Tyr Val Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu 
225 230 235 

TAC CAG AAG AGA GTG CTG ACC ATA ACC GGC ATC TGC ATC 1239 
Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile 
240 245 

GCC CTC CTT GTG GTC GGC ATC ATG TGT GTG GTG GCC TAC 1278 
Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr 

250 255 260 



FIG. 14B 
TGC AAA ACC AAG AAA CAG CGG AAA AAG CTG CAT GAC CGT 1317 
Cys Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg 
265 270 

CTT CGG CAG AGC CTT CGG TCT GAA CGA AAC AAT ATG ATG 1356 
Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn Asn Met Met 
275 280 285 

AAC ATT GCC AAT GGG CCT CAC CAT CCT AAC CCA CCC CCC 1395 
Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro 
290 295 300 

GAG AAT GTC CAG CTG GTG AAT CAA TAC GTA TCT AAA AAC 1434 
Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn 
305 310 

GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA GCA GAG 1473 
Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu 
315 320 325 

ACA TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA GCC CAT 1512 
Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His 
330 335 

CAC TCC ACT ACT GTC ACC CAG ACT CCT AGC CAC AGC TGG 1551 
His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp 
340 345 350 

AGC AAC GGA CAC ACT GAA AGC ATC CTT TCC GAA AGC CAC 1590 
Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser His 
355 360 365 

TCT GTA ATC GTG ATG TCA TCC GTA GAA AAC AGT AGG CAC 1629 
Ser Val Ile Val Met Ser Ser Val Glu Asn Ser Arg His 
370 375 

AGC AGC CCA ACT GGG GGC CCA AGA GGA CGT CTT AAT GGC 1668 
Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn Gly 
380 385 390 

ACA GGA GGC CCT CGT GAA TGT AAC AGC TTC CTC AGG CAT 1707 
Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His 
395 400 

GCC AGA GAA ACC CCT GAT TCC TAC CGA GAC TCT CCT CAT 1746 
Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His 
405 410 415 

AGT GAA AGG TAAAA CCGAAGGCAA AGCTACTGCA GAGGAGAAAC 1790 
Ser Glu Arg 
420 



FIG. 14C 
TCAGTCAGAG AATCCCTGTG AGCACCTGCG GTCTCACCTC AGGAAATCTA 1840 
CTCTAATCAG AATAAGGGGC GGCAGTTACC TGTTCTAGGA GTGCTCCTAG 1890 
TTGATGAAGT CATCTCTTTG TTTGACGGAA CTTATTTCTT CTGAGCTTCT 1940 
CTCGTCGTCC CAGTGACTGA CAGGCAACAG ACTCTTAAAG AGCTGGGATG 1990 
CTTTGATGCG GAAGGTGCAG CACATGGAGT TTCCAGCTCT GGCCATGGGC 2040 
TCAGACCCAC TCGGGGTCTC AGTGTCCTCA GTTGTAACAT TAGAGAGATG 2090 
GCATCAATGC TTGATAAGGA CCCTTCTATA ATTCCAATTG CCAGTTATCC 2140 
AAACTCTGAT TCGGTGGTCG AGCTGGCCTC GTGTTCTTAT CTGCTAACCC 2190 
TGTCTTACCT TCCAGCCTCA GTTAAGTCAA ATCAAGGGCT ATGTCATTGC 2240 
TGAATGTCAT GGGGGGCAAC TGCTTGCCCT CCACCCTATA GTATCTATTT 2290 
TATGAAATTC CAAGAAGGGA TGAATAAATA AATCTCTTGG ATGCTGCGTO 2340 
TGGCAGTCTT CACGGGTGGT TTTCAAAGCA GAAAAAAAAA AAAAAAAAAA 2390 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA A 2431 

FIG. 14D 